(57) Abstract: A network management system is provided that allows a user to configure multiple
devices according to a consistent set of policies. The system includes a device learning module
that can read configuration data from a network device and automatically match that configuration
data to existing policies and components of policies within the system. The device learning module
also identifies unknown configuration data, which does not match any existing policy. The system
further includes a grammar builder that can parse the unknown configuration data and construct a
component or policy from the unknown data, by matching the unknown data to a grammar of
configuration commands for the network device. The system also provides auditing capabilities,
where policies are compared to running network configurations, and differences are identified.
METHODS AND SYSTEMS FOR CONTROLLING
NETWORK INFRASTRUCTURE DEVICE

Background of the Invention

[0001] Enterprise networks today are composed of hundreds to thousands of network devices
arranged in such a way as to connect sites together, and provide both internal network resources as
well as Internet access to employees. Service provider networks are even larger, often composed
of tens of thousands of network devices. These devices include routers, LAN switches, and
firewalls, in addition to other types of specialized devices (e.g., bandwidth performance
measurement and control, traffic "load balancers," etc.).

[0002] Virtually none of these devices are functional within the network when removed from their
shipping boxes. Each device has the hardware necessary to perform its function, and each device
typically has software which handles any higher-level processing as well as presenting a
configuration interface to users. Generically, this software is referred to as the device's "operating
system". For some devices, the operating system presents few options to the user and thus requires
little setup before the device is functional within the network; an example would be the low end
LAN Ethernet switches which are ubiquitous in many networks today. For higher-end devices,
such as those which run the "backbone" of most enterprise or Internet networks, or devices within
the corporate data center, the operating system can present a truly vast array of options which
govern device functionality. These options generally must be configured by the user before the
network device is useable.

[0003] Routers and switches from vendors such as Cisco Systems, for example, can require tens to 
thousands of individual configuration commands in order to function within the network. At the 
simplest level, the number and combination of commands required on a specific device is a 
function of its role in the network, the network protocols used, the number and type of connections 
handled by the devices, and security measures employed on the network.

[0004] The process for creating these configurations starts with the overall network design, which
is often expressed by network engineers in diagrams of the physical network along with knowledge
of which network protocols and other functionality are in use. Within organizations with more
rigorous standards, configurations for classes of devices are often "templated," with examples of
configurations set up by senior engineers and then used (with appropriate device-specific data) by
engineers in the field who deploy and maintain devices. Templates may be stored in a
version-control system, in order to track changes. In less rigorous organizations, templates may
merely be word-processor documents or memos which outline standards for device configuration.

[0005] Once configurations are designed for each network device, the commands making up the
configuration must be applied to the device itself. Several mechanisms exist for doing so, ranging
from typing the configuration commands into the device's command-line interpreter to a number of
ways to "download" the configuration commands all at once (TFTP, FTP, Secure Shell copy, etc.).
This process is then repeated for all devices that need to be updated.

[0006] The processes described above are tedious, time-consuming, and error-prone. Attempts 
have therefore been made to automate the routine aspects of these processes. Nearly every large 
network engineering organization, for example, will likely possess "scripts" or small programs - 
typically written in-house - to retrieve and possibly even "download" configurations to multiple 
devices, change passwords, or construct lists of devices and configurations. Scripts of this type
simply replace the repetitive aspect of logging in to many devices and typing the same commands 
repeatedly. 

[0007] Commercial attempts to automate device configuration essentially start from this basis.
Tools such as Resource Manager Essentials (part of the CiscoWorks family of network
management applications) provide access to the configuration text within a web browser interface,
and allow any edits to be easily deployed to the running network device. Such systems typically
also save each version of a configuration and allow users to view this history, displaying
differences between configurations in a visual format; history is also useful for rolling back to an
earlier configuration version and effectively erasing mistakes within the network.

[0008] In the description above, no mention was made of providing tool support during the process
of designing the configuration. The simplest device configuration systems don't provide such
support. Designing network configurations is left to engineers to work out by traditional methods,
and then the design is manually translated into individual configurations. The applications simply
allow the engineer to edit the configurations safely - i.e., within the context of the tool, before it is
distributed to the network devices themselves. The simplest applications also manage devices
individually - in other words, if changes need to be made to 100 device configurations, the tool
would be used 100 times to edit 100 individual configurations. This may provide the safety of
working offline but it does not make the process any more efficient.

[0009] One approach to providing efficiency is to allow a common set of commands to be
deployed to many devices at the same time, achieving a form of mass configuration. This
capability goes by many names commercially (e.g., Command Sets in Intelliden's product), and is
referred to as a "template" herein. In the simplest implementations, templates are a set of complete
commands (i.e., with data values filled in) which can be deployed to a set of network devices. In
more complex implementations, templates may allow "variables" which serve as placeholders for
data which need to be filled in based upon the individual device. Often, the user is prompted to fill
in data values for each device, or provide a list of values.
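As a rough illustration of the template-with-variables approach described above, the following Python sketch renders one shared template once per device. The command text, the variable names `hostname` and `mgmt_ip`, and the device names are all hypothetical and are not taken from any particular product.

```python
from string import Template

# Hypothetical template: $hostname and $mgmt_ip are placeholders filled in per device.
TEMPLATE = Template(
    "hostname $hostname\n"
    "interface Ethernet0\n"
    " ip address $mgmt_ip 255.255.255.0\n"
)

# Hypothetical per-device values, standing in for what a user would be prompted to enter.
DEVICE_VALUES = {
    "edge-1": {"hostname": "edge-1", "mgmt_ip": "10.0.0.1"},
    "edge-2": {"hostname": "edge-2", "mgmt_ip": "10.0.0.2"},
}


def render_all(template, values_by_device):
    """Return {device name: rendered command text} for every device."""
    return {
        name: template.substitute(values)
        for name, values in values_by_device.items()
    }


configs = render_all(TEMPLATE, DEVICE_VALUES)
```

`Template.substitute` raises `KeyError` when a value is missing, which mirrors the need to supply every data value before a template can be deployed.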

[0010] Several limitations exist for the "pure template" approach as described above. With
templates that are composed of complete configuration commands, it is difficult to create a
well-factored set of truly generic templates which can reflect your network design in a "normalized"
fashion. The term "normalization" is borrowed from the field of database design, and refers to the
situation that exists when each piece of data is reflected in only one place within the database, and
all pieces of data may be retrieved by a single well-defined query (however complex it might turn
out to be). Normalization is an important goal for network configuration as well, because it reflects
the state when each piece of network functionality can be changed with the minimum of effort and
the minimum impact on unrelated devices or functions.

[0011] In a generic or "normalized" set of configuration templates, each aspect of the network
design would exist in a single place. All devices which incorporated that aspect of network
functionality could be updated by editing and re-deploying that template. In order to accomplish
this, templates cannot simply contain configuration commands; two other capabilities are useful in
order to achieve full normalization.

[0012] First, templates should not incorporate complete commands, but should instead be
somewhat abstract. This capability allows templates to be adapted to the specific devices onto
which the functionality will be deployed. The user experience may or may not be wholly abstract,
but the underlying template should be stored as abstract versions of commands, with a process to
translate them into final configuration commands appropriate to each device.

[0013] Second, templates should allow for data references and queries, with a sufficiently rich 
ability to cross-reference data within and between devices. This allows an abstract template (as 
discussed above) to incorporate data values which are appropriate to a given device - nothing need 
be "hard-coded." This capability allows each piece of data (e.g., an Ethernet interface IP address)
to exist in one location, and simply be referenced everywhere else. Whenever such a source behind 
a reference changes, each reference to that data should be updated as well. 
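The data-reference idea can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `{device:key}` reference syntax, the `self` keyword, the key name `eth0.ip`, and the in-memory `device_data` store are all invented here and are not the grammar used by the described system.

```python
import re

# Hypothetical device data store: each value lives in exactly one place.
device_data = {
    "core-1": {"eth0.ip": "10.1.1.1"},
    "edge-1": {"eth0.ip": "10.1.1.2"},
}

# Hypothetical reference syntax: {device:key}, or {self:key} for the device being configured.
REF = re.compile(r"\{(\w[\w-]*):([\w.]+)\}")


def resolve(template, device, data):
    """Replace each reference with the current value held in the data store."""
    def lookup(match):
        owner = device if match.group(1) == "self" else match.group(1)
        return data[owner][match.group(2)]
    return REF.sub(lookup, template)


abstract = (
    "ip address {self:eth0.ip} 255.255.255.0\n"
    "ip route 0.0.0.0 0.0.0.0 {core-1:eth0.ip}"
)

before = resolve(abstract, "edge-1", device_data)
device_data["core-1"]["eth0.ip"] = "10.1.1.254"  # change the single source of the value
after = resolve(abstract, "edge-1", device_data)
```

Because the template stores only the reference, re-resolving after the source changes updates every place that refers to it; nothing is hard-coded in the template itself.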

[0014] Thus there is a need for systems and methods of providing a normalized management of
network designs and configurations, and allowing technologies for design to be algorithmically
linked to actual device configurations.

Summary of the Invention 

[0015] In an embodiment, software systems and methods for managing the configurations of all
devices on a network, through subscriptions to a common database of policies, are disclosed.
These policies are an embodiment of "normalized" configuration templates as discussed in the
previous section; policies thus adapt themselves to the specific device being configured, and can
allow for device-specific data references and queries.

[0016] Embodiments of the invention also maintain control over policy and data modifications,
providing a complete version history for each element managed in the database. The method of an
embodiment is thus a policy-driven or policy-based method for managing network device
configurations. The system also incorporates a method for automatically updating the database of
policies, using a learning system that incorporates new syntax whenever encountered.

[0017] The system of an embodiment also provides capabilities that bridge the gap between
configuration control and network monitoring. Because the system can analyze a native device
configuration and return the list of policies implemented, it can continually re-analyze devices and
monitor changes to devices at the policy level. Using this technology, the system alerts users when
devices fail to implement the intended policies or when changes made outside the system, such as
manual changes made by network engineers, cause divergence among devices.
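Monitoring at the policy level can be pictured as a set comparison between the policies a device is meant to implement and those recovered from its native configuration. The Python sketch below assumes that analysis has already produced a list of policy names; the policy names and the report layout are hypothetical.

```python
def audit(intended, implemented):
    """Compare the policy names intended for a device with those found on it."""
    intended, implemented = set(intended), set(implemented)
    return {
        "missing": sorted(intended - implemented),     # intended but not on the device
        "unexpected": sorted(implemented - intended),  # on the device but not intended
    }


def should_alert(report):
    """Alert when the device diverges from its intended policies in either direction."""
    return bool(report["missing"] or report["unexpected"])


report = audit(
    intended=["ntp-standard", "snmp-readonly"],
    implemented=["ntp-standard", "telnet-enabled"],
)
```

Re-running `audit` each time a device is re-analyzed would surface both kinds of divergence described above: policies a device fails to implement, and changes made outside the system.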

[0018] Additionally, embodiments of the invention are designed to overcome the limitations of a
"pure template" approach, provide "normalized" management of network designs and
configurations, and allow technologies for design to be algorithmically linked to actual device
configurations.

[0019] Another aspect of an embodiment of the invention is the degree to which the structure of
the configuration, as well as the semantics of configuration commands, are parsed and understood
by the automation tool. Earlier approaches to the problem treat commands and configurations as
blocks of text which have meaning to the human user, and to the network device, but not to the
automation tool. Another aspect of an embodiment of the invention is that device configurations
are written in a "regular language," and thus are amenable to the standard tools of linguistic
parsing, analysis, and generation.

[0020] Configuration comprehension is realized in an embodiment of the invention by the use of a
compiler which handles both the incoming parsing of native configurations and outgoing
production of new native configurations. Because embodiments of the invention are designed to
control many different types of hardware from multiple vendors, this compiler is modular, allowing
the same "source code" (e.g., a tree of configuration elements) to be translated into different
"executables" (e.g., the specific configuration languages of different vendors).
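The modular-compiler idea, one vendor-neutral element translated by interchangeable back ends, might look like the Python sketch below. The two output styles (`style-a`, `style-b`) are invented for illustration and do not claim to reproduce any vendor's actual command language; the element layout is likewise an assumption.

```python
# Hypothetical vendor-neutral configuration element: an interface with an address.
element = {"kind": "interface", "name": "eth0", "ip": "10.0.0.1", "prefix": 24}


def prefix_to_mask(prefix):
    """Convert a prefix length such as 24 into a dotted mask such as 255.255.255.0."""
    bits = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    return ".".join(str((bits >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def emit_style_a(e):
    # Invented block-structured style that uses a dotted mask.
    return [f"interface {e['name']}", f" ip address {e['ip']} {prefix_to_mask(e['prefix'])}"]


def emit_style_b(e):
    # Invented single-line style that uses prefix notation.
    return [f"set interfaces {e['name']} address {e['ip']}/{e['prefix']}"]


# Each back end is a module that can be registered without touching the others.
BACKENDS = {"style-a": emit_style_a, "style-b": emit_style_b}


def compile_element(e, target):
    """Translate the same source element into the requested target language."""
    return "\n".join(BACKENDS[target](e))
```

Adding support for another command language would then amount to registering one more function in `BACKENDS`, leaving the source tree unchanged.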

[0021] Some vendors (e.g., Cisco Systems) have many product lines, often with different operating
systems and different command languages. Each vendor (and operating system) supported by
embodiments of the invention thus has a formal grammar, which was initially produced by hand.
For each unique combination of vendor and language, a formal grammar and methods for
interacting with the device to retrieve and update configurations are used.

[0022] In addition, aspects of embodiments of the invention allow grammars to be extensible at
runtime, since vendors frequently add new commands whenever new hardware or new
functionality appears within a product line. Within embodiments of the invention, grammars are
expandable without additional programming, because the parser is designed to recognize (and
isolate for analysis) sections of native configurations which do not match any known device
configuration command. Segments of native configuration representing unknown syntax can then
be turned into full grammar through a system for discovering and automatically writing new
grammar segments. These new grammar segments can then be inserted into the grammar database
and used immediately for parsing incoming native configurations or compiling new configurations
for output to a network device.
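A highly simplified Python sketch of isolating unknown sections and extending a grammar at runtime is given below. It stands in one regular expression for each known command form; the rule names, the patterns, and the `learn` helper are hypothetical, and real grammar discovery would be far more involved than registering a pattern.

```python
import re

# Hypothetical "grammar database": one regular expression per known command form.
KNOWN = {
    "hostname": re.compile(r"^hostname \S+$"),
    "ntp-server": re.compile(r"^ntp server \S+$"),
}


def partition(config_text):
    """Split lines into recognized (rule, line) pairs and runs of unknown lines."""
    recognized, unknown_regions, run = [], [], []
    for line in config_text.splitlines():
        rule = next((name for name, rx in KNOWN.items() if rx.match(line)), None)
        if rule:
            if run:  # a recognized line closes the current unknown region
                unknown_regions.append(run)
                run = []
            recognized.append((rule, line))
        else:
            run.append(line)
    if run:
        unknown_regions.append(run)
    return recognized, unknown_regions


def learn(name, pattern):
    """Insert a new grammar segment; it is usable on the very next parse."""
    KNOWN[name] = re.compile(pattern)
```

After `learn` is called, the same `partition` function recognizes the previously unknown lines without any change to the parsing code.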

[0023] At a high level, the system of an embodiment can be broken into two major functional
areas. First, the system allows large numbers of network devices to be configured and controlled
using flexible policies which are easily created by users of the system without writing any
programming code or understanding the inner workings of parsers or compilers. Second, the
system incorporates innovations that are designed to automatically incorporate new information
about changes that hardware vendors make to their product lines, without requiring an update to the
system code.

Description of the Drawings 

[0024] The accompanying drawings are included to provide a further understanding of
embodiments of the invention and together with the Detailed Description, serve to explain the 
principles of the embodiments disclosed. 

[0025] FIG. 1 depicts a network management system in accordance with an embodiment of the
invention.

[0026] FIG. 2 depicts an instance tree for a policy-driven configuration.

[0027] FIG. 3 depicts a policy for use with a policy-driven configuration.

[0028] FIG. 4 depicts a native configuration as parsed by the device learning system of an
embodiment of the invention.

[0029] FIG. 5 depicts the stages of parsing a native configuration into a policy-driven
configuration according to an embodiment of the invention.

[0030] FIG. 6 depicts a method of parsing a native configuration.

[0031] FIG. 7 depicts a method of identifying policies contained in a parsed configuration.

[0032] FIG. 8 depicts a method of handling parsing errors.

[0033] FIG. 9 depicts an instance tree containing recognized components and unknown regions.

[0034] FIG. 9A depicts an unknown region contained within a recognized component.

[0035] FIG. 10 depicts a method of processing an instance tree to recognize candidate
components.

[0036] FIG. 11 depicts a generalized method of resolving candidate components into components.

[0037] FIG. 12 depicts a method of creating an abstract syntax tree for a command root.

[0038] FIG. 13 depicts an abstract syntax tree created according to the method of FIG. 12.

[0039] FIG. 14 depicts a method of transforming an abstract syntax tree into a grammar for a
component.

[0040] FIG. 15 depicts a method of identifying command boundaries within a grammar tree.

[0041] FIG. 16 depicts a method of discovering command-level semantics caused by alterations to
configurations.

[0042] FIG. 17 depicts a method of identifying default values and equivalencies in command
attributes.

[0043] FIG. 18 depicts a method of identifying attributes which can create unique instances of a
component.

[0044] FIG. 19 depicts a method of identifying addition dependencies in a configuration.

[0045] FIG. 20 depicts a method of identifying removal dependencies in a configuration.

[0046] FIG. 21 depicts a component having multiple alternative sets of syntax blocks.

[0047] FIG. 22 depicts a method of compiling a policy-driven configuration into a native
configuration.

[0048] FIG. 23 depicts a method of applying a native configuration to a network device.

[0049] FIG. 24 depicts a method of auditing a native configuration against a policy-driven
configuration, to detect differences between the two.

[0050] FIG. 25 depicts a method of auditing a native configuration to ensure network design
consistency is maintained.

[0051] In the system of an embodiment shown in FIG. 1, network device data structures 12 are
data structures that represent physical devices 10. Examples include routers, switches, or firewalls.
Each physical network device 10 is represented by a network device data structure 12, which is
stored in the network device database 14. Customers purchase a software license which enables a
fixed number of device data structures 12 to be created and stored in the device database 14.
Additional licenses to create and store device data structures 12 can be purchased throughout the
lifetime of the product. Each device data structure 12 contains metadata (information) concerning
that device 10, such as information about the device vendor, software operating system or
command language version, and the appropriate methods and authentication credentials for
executing commands on the device 10. Each device data structure 12 also contains a native
configuration for the associated network device 10. Furthermore, the network device data structure
contains pointers to user-created metadata about the device. These metadata include categories and
groupings useful for organizing a large number of devices, as well as for creating policies.

[0052] Each network device 10 is associated, via a network device data structure 12, with zero or
more Policy-Driven ("PD") configurations 16, each of which represents a complete set of
directives needed for the physical network device 10 to function in an intended manner. These PD
configurations 16 are stored in a component database 28. In an embodiment, a network device has
one "active" configuration at any time, and the user can switch active status between any of the
stored PD configurations 16 associated with a device 10 at any point in time. In an alternate
embodiment, a network device 10 can have more than one active configuration 16.

[0053] Policy-driven configurations 16 are data structures which represent the total desired state of
a network device 10 within the system. PD configurations 16 contain references to a set of
instances 20, policies 34, and device data stored in persistent storage 22. The instances 20 are
stored in the component database 28.

[0054] In terms of implementation, PD configurations 16 are a set of references or pointers to
instances 20 of components 26 stored elsewhere in the component database 28, or policies 34.
Components 26 are not directly used by device configurations 16. Instead, following
object-oriented practice, "instance" objects, i.e. instances 20, are created whenever a component 26 is
attached to a PD configuration 16. Instances combine a reference to a component 26, and
device-specific data stored in the device data storage 22. Policies 34 are persistent groups of instances 20
which can be reused across many PD configurations 16.
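The relationships among components, instances, policies, and PD configurations can be sketched with a few Python dataclasses. This is an illustrative model only: the field names, the `{placeholder}` syntax, and the `render` helper are assumptions and are not the storage schema of the described system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    name: str
    syntax: str  # abstract syntax containing {placeholders}


@dataclass
class Instance:
    component: Component  # a reference to the component, not a copy of it
    data_keys: dict       # placeholder name -> key in the device data store


@dataclass
class Policy:
    name: str
    instances: list = field(default_factory=list)  # reusable group of instances


@dataclass
class PDConfiguration:
    device: str
    instances: list = field(default_factory=list)  # instances attached directly
    policies: list = field(default_factory=list)   # shared policies, by reference

    def all_instances(self):
        shared = [inst for policy in self.policies for inst in policy.instances]
        return self.instances + shared


def render(config, device_data):
    """Resolve each instance's placeholders against the data for config.device."""
    lines = []
    for inst in config.all_instances():
        values = {ph: device_data[config.device][key] for ph, key in inst.data_keys.items()}
        lines.append(inst.component.syntax.format(**values))
    return lines
```

Because a `PDConfiguration` holds references rather than copies, the same `Policy` object can appear in many configurations while each device still resolves its own data.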

[0055] If an instance 20 is created purely to serve within the context of a single device 10, the
system creates a "private" or anonymous instance 20 of the component 26, which contains both
syntax and references to device-specific data which are retrieved from storage 22 in the process of
resolving data references set up in the grammar. Private instances do not show up in the catalog 46
of components displayed to the user for reuse, since private instances are not reusable.

[0056] If an instance 20 is created to be used on more than one device 10, the system creates a
policy 34 which is a public instance of the component. Policies may be reused on any number of
devices 10, and may include entire collections of instances 20 and data references. Policies thus act
like "templates" which aggregate together functionality, saving manual configuration effort and
increasing consistency and accuracy across the customer's network. Policies are displayed in the
catalog 46 of components for use by the user, and stored in the component database 28.

[0057] Some of these data references may be partially filled because their values are not
device-specific (e.g., routing protocol parameters which are constant across devices but need to be
customized for the user's particular network), while other data references are resolved for each
device configuration 16 to which the policy 34 is attached.

[0058] Policies 34 are the means by which configurations can be factored into larger-scale units
and reused. Policies 34 create a "change once, apply everywhere" semantic to network device
configuration, and are the principal mechanism for decreasing the effort required to run a network
using the system. When policies 34 are added to PD configurations 16, the system keeps a database
record of the policy linkage 35. This linkage is used in advanced device monitoring and auditing,
as described below. When a policy 34 is removed from PD configurations 16, the appropriate
policy linkage 35 is removed from the database record of policy linkages.

[0059] Turning to FIG. 2, instances 20 in a PD configuration 16 are organized in a strict tree 30
which organizes instances 20 into a series of containers 32 which correspond to vendor-neutral or
abstract networking concepts. This tree 30 does not necessarily correspond to the topology of the
actual grammar of the vendor's command language as stored in the totality of syntax blocks stored
in instances 20. The mapping between the two is handled by custom directives embedded in a
grammar specification language, which allow reorganization of the component tree 30 along with
subsequent compilation using the "correct" set of syntax (derived from the instance tree 30), as
discussed in detail below. The containers 32 are present in the catalog 46 of components and
policies, and are used to construct a human-readable representation of the catalog within the user
interface 40.

[0060] Turning to FIG. 3, a policy 34 represents a reusable set of instances 20. At the top of FIG.
3, a single instance 20 is expanded to show its internal structure. Within the instance 20 is a
collection of syntax blocks 36, for example one or more configuration directives, possibly
associated with configuration data 38. Policies 34 may contain other policies as well, which means
that an instance often points to zero or more child sub-policies 40. When compiled into a native
configuration, instances 20 and any sub-policies 40 which are included in a policy 34 are compiled.
This behavior allows the creation of reusable policies which lessen the work required to create
standardized sets of network devices.
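Compiling a policy together with its sub-policies is naturally recursive. The Python sketch below assumes a policy is a plain dictionary with `instances` and `sub_policies` keys and that a policy's own instances are emitted before those of its children; both the layout and the ordering are assumptions made for illustration.

```python
def compile_policy(policy):
    """Depth-first walk: the policy's own instances first, then each sub-policy in order."""
    lines = list(policy.get("instances", []))
    for sub in policy.get("sub_policies", []):
        lines.extend(compile_policy(sub))
    return lines


# Hypothetical policies; each "instance" is already-resolved command text for brevity.
time_policy = {"instances": ["ntp server 10.0.0.5"]}
logging_policy = {"instances": ["logging host 10.0.0.9"]}
site_policy = {
    "instances": ["hostname r1"],
    "sub_policies": [time_policy, logging_policy],
}
```

Since `time_policy` is an ordinary value, it can be nested inside any number of parent policies, which is what makes the nested policies reusable.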

[0061] At the lowest level, the system of an embodiment contains a set of component syntax
blocks 36 for a given network device vendor or configuration language. These components are an
object-oriented view of the grammar specification for a given configuration language, and as such
are abstract. In other words, device-specific data 38 is usually not associated with the component
syntax blocks 36. In alternate embodiments, however, device-specific data 38 may be associated
with a component syntax block 36, for example if the configuration language itself is
device-specific. References to device-specific data 38 are denoted in the syntax block 36 by a "variable"
or "placeholder" grammar construct that indicates that a position within the syntax block 36 is to be
filled in with the results of a database query into device data storage 22, for example when the
component 26 containing the component syntax block 36 is instantiated into an instance 20.
Component syntax blocks 36 are editable using a component editor 41 within the user interface 40
using a simplified graphical method for adding, deleting, and modifying syntax block elements.

[0062] Component syntax is created in several ways - by direct creation within the user interface
40 using a component editor 41, by downloads 42 received from an outside source such as a
manufacturer of the system or a third party component creator, or by Grammar Builder 45.
Grammar Builder 45 allows the system to "learn" new syntax by analysis of candidate components
58 for syntax that is not recognized as part of the existing component database 28. Grammar
Builder 45 is described in detail below.

[0063] Physical devices 10 possess a single running configuration at any one time - the set of
commands, language directives, and data used by onboard operating system software or firmware
to produce the running behavior of the device. This is referred to as a "native configuration".
Some devices can store alternative configurations in memory or persistent storage (e.g., Cisco IOS
devices store startup configurations in NVRAM, and these can be separate in some cases from the
running configuration in RAM). Native configuration refers to the set of commands, directives,
and data stored on a physical device, whether running or alternate. Native configurations are
retrieved, stored in the device data structures 12, and analyzed during device registration, and are
created by the Device Learning System (DLS) 44, for loading onto the network device 10, when
compiling a PD configuration 16 during preview or task execution. Native configurations may also
be revised directly on the network device 10, for example by an engineer performing a manual
update 48. Manual updates may occur during troubleshooting or in order to install a change
recommended by the network device vendor.

[0064] Policy-driven configurations 16 (as well as components and data) are version-controlled
within the system. These entities are edited by checking out the entry into a local working area
within the user interface 40. This working area is referred to as a "workspace," and workspaces
can be personal or shared by a group of users for collaborative work. Entities which are edited
within a workspace are then checked in, creating a new persistent version of the entity. Users can
browse the history of each entity, and roll back the current state of an entity to a previously stored
version.
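The check-out, check-in, and roll-back cycle can be sketched with a small Python class. Two choices here are assumptions rather than anything stated above: entities are modeled as plain strings, and a rollback is recorded as a new version so that no history is lost.

```python
class VersionedEntity:
    """Keeps every checked-in version of an entity; version numbers start at 0."""

    def __init__(self, initial):
        self._history = [initial]

    @property
    def current(self):
        return self._history[-1]

    def check_out(self):
        # Strings are immutable, so handing out the current value is a safe working copy.
        return self.current

    def check_in(self, edited):
        """Store a new persistent version and return its version number."""
        self._history.append(edited)
        return len(self._history) - 1

    def history(self):
        return list(self._history)

    def roll_back(self, version):
        """Make an earlier version current again by checking it in as the newest version."""
        return self.check_in(self._history[version])
```

A shared workspace could be modeled as several users editing copies obtained from `check_out` before one of them calls `check_in`; conflict handling is outside the scope of this sketch.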

[0065] Editing is done in the context of a "job", which serves as a container within the user
interface 40 for organizing the work needed for accomplishing a real-world project. Examples of
projects range in scope from "deploying a new Ethernet switch" to "create an enterprise-wide mesh
of VPN tunnels." Projects begin with the editing of entities within a workspace - for example, PD
configurations 16 or policies 34, and are finished when each device 10 requiring update has
received the changes which result from such edits.

[0066] Within a job, edits to a policy 34 may affect many different network devices.
Dependencies between network devices and policies 34 are maintained within the system (as a
series of policy linkages 35), so that a task may be created for each device 10 affected by edits to a
policy 34. Changes to private instances 20 within a PD configuration 16 also trigger the creation of
a task for updating the network device 10. Tasks are workflow items, owned by a user of the
system and requiring resolution before a job is completed.
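Deriving tasks from policy linkages can be sketched as a lookup from edited policies to affected devices. In this Python illustration the linkage records are simple (policy, device) pairs and a task is a dictionary; both representations, and all names, are hypothetical.

```python
# Hypothetical policy linkage records: (policy name, device name).
linkages = [
    ("ntp-standard", "r1"),
    ("ntp-standard", "r2"),
    ("snmp-readonly", "r2"),
]


def tasks_for_edits(edited_policies, linkages, edited_private=()):
    """Return one task per affected device, listing why that device needs an update."""
    reasons = {}
    for policy, device in linkages:
        if policy in edited_policies:
            reasons.setdefault(device, []).append(f"policy:{policy}")
    for device in edited_private:  # devices whose private instances were changed
        reasons.setdefault(device, []).append("private-instance")
    return [
        {"device": device, "reasons": why, "resolved": False}
        for device, why in sorted(reasons.items())
    ]


def job_complete(tasks):
    """A job is complete only when every task has been resolved."""
    return all(task["resolved"] for task in tasks)


tasks = tasks_for_edits({"ntp-standard"}, linkages, edited_private=["r3"])
```

A device touched by several edits in the same job still receives a single task, with each cause listed under `reasons`.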

[0067] The system of an embodiment is designed to automatically track changes made by network
hardware vendors to their command languages and syntax. Previously, products either forced the
human user to track vendor changes, or wait for the software solution vendor to produce product
updates.

[0068] The process of importing network devices 10 into the system may involve both making
entries into a device and license inventory database 14, and the retrieval and analysis of the native
configuration running on the device 10 at the time of import. The latter activity is performed by the
Device Learning System 44, as discussed in detail below.

[0069] The database 14 of basic device information is a standard SQL database used to record the
name and other metadata concerning each network device 10. Examples of metadata include the 
location and model number of each network device 10. These metadata are used for grouping and 
sorting functions within the system's user interface 40. 

[0070] In an embodiment, entries made in the device inventory database 14 are tracked against the
customer's purchased license. A "grace period" is activated when the inventory reaches the total
purchased license, allowing the customer to exceed their paid license count for a temporary
interval while they acquire additional licenses from the system vendor. This feature is for customer
convenience, and can be disabled within the system if deemed desirable.

[0071] Importing native configurations from the running device 10 accomplishes two goals. First,
import of the existing native configuration saves a significant amount of re-work by customers,
thus easing adoption and speeding the utility of the system for customers. Second, importation and
subsequent analysis of the parsed configuration is useful in Grammar Builder 45 - which allows
for extending the database 28 of components and policies without significant manual effort on the
part of the customer or vendor.

[0072] Turning to FIG. 4, in an example of the operation of the Device Learning System 44, a
native configuration 50 imported from a running network device 10 will contain sets of
configuration commands 52 that already exist in recognized component form 54 within the system,
as well as some constructs 56 which are not represented by recognized components 54. Those
constructs 56 which are not represented by recognized components 54 are subsumed by candidate
components 58 which can be later analyzed by Grammar Builder 45.

[0073] Turning to FIG. 5, a high level view of the stages of the Device Learning System 44 is
shown. The Device Learning System 44 begins with the native configuration 50 as an input. The
native configuration 50 is provided to a lexer module 60, where each of the literal strings in the
native configuration 50 is assigned a token ID (tokenized). The tokenized configuration is emitted
as a data stream to a parser 62, which parses the configuration into either recognized components
54, or candidate components 58. The parser 62 is configured using the components 26 in the
component database 28, such that the parser will recognize any components in the native
configuration 50 which match components 26 stored in the component database 28. The lexer 60
is also configured using the components in the component database 28, such that the lexer 60 will
recognize the tokens used in the grammar embodied in the components 26. Then, the set of
components is analyzed by policy matcher 59 to determine which, if any, policies 34 are
represented. The result of Device Learning System 44 analysis is a Policy Driven Configuration 16
and zero or more candidate components 58.

[0074] Configuration analysis and configuration compilation use the same parsing engine. This is
done to allow the component structure to be used symmetrically: either in parsing and analysis of
an existing device 10, or as a specification for emitting a new native configuration 50 at
compile time. In an embodiment, the parsing engine uses a custom grammar specification
language discussed below, rather than a YACC-style grammar specification. Alternatively, a
YACC-style grammar specification may be used.

[0075] The custom grammar specification language of an embodiment uses a very close coupling
of the lexing (tokenization) and parsing functions in order to deal with complex configuration
languages, many of which were not "designed" but rather evolved over many releases. In contrast
to a typical YACC-style grammar specification, where semantic actions are explicit within the
grammar, the custom grammar specification language avoids explicit semantic actions in order to
use the same grammar specification for both analysis and configuration compilation. Semantic
actions refer to the code executed when a specific syntactic construct is matched by the parser 62;
the action might be to insert the parsed data into a data structure, or to execute some application
functionality, for example. The custom grammar specification closely couples the lexer and parser
in order to implement one step in the analysis of candidate components 58.

[0076] Within the system, semantic actions are left implicit in the grammar specification, and are
inferred based on the type of parser being constructed: an analysis parser 62 (for analyzing
existing configurations) or a compiler (for creating configurations from components). In the case
of an analysis parser 62, semantic actions include creating an in-memory tree representation of the
configuration syntax, along with "actions" which reorganize the tree and insert parsed data into the
appropriate class objects as member variables. In the case of a generated compiler, semantic
actions include resolving data references and emitting "instructions" in the form of syntactically
correct configuration commands in the relevant device vendor's language (e.g., Cisco IOS).

[0077] In order to create a learning effect, the analysis parser 62 is freshly constructed prior to
importing the native configuration 50 from a new device 10 (although caching can be used as an
optimization where appropriate). In other words, the contents of the component database 28 are
used to construct the lexer 60 and parser 62 anew for each run. This means that as components 26
are added, either directly in the GUI or through Grammar Builder 45, the parser 62 becomes
incrementally richer and better able to recognize the user's configurations at a component or policy
level.
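
The learning effect can be illustrated with a minimal sketch. It assumes, purely for illustration, that the component database is a dictionary mapping component names to the leading literal tokens of their syntax, and that prefix matching stands in for the generated grammar. The names build_parser and ComponentDB are hypothetical and do not appear in the described system.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical component database: name -> leading literal tokens of its syntax.
ComponentDB = Dict[str, Tuple[str, ...]]
ParseResult = Tuple[List[Tuple[str, str]], List[str]]


def build_parser(db: ComponentDB) -> Callable[[str], ParseResult]:
    """Construct a fresh parser from the current contents of the database."""
    # Snapshot the rules, longest prefix first, so later additions to the
    # database only take effect when a new parser is built.
    rules = sorted(db.items(), key=lambda kv: -len(kv[1]))

    def parse(config: str) -> ParseResult:
        recognized: List[Tuple[str, str]] = []
        candidates: List[str] = []
        for line in config.splitlines():
            tokens = tuple(line.split())
            if not tokens:
                continue
            for name, prefix in rules:
                if tokens[: len(prefix)] == prefix:
                    recognized.append((name, line.strip()))
                    break
            else:
                candidates.append(line.strip())  # no component matched
        return recognized, candidates

    return parse


config = (
    "ip address 192.168.1.1 255.255.255.252\n"
    "carrier-delay msec 300\n"
    "bandwidth 1534"
)
db: ComponentDB = {"ip-address": ("ip", "address"), "bandwidth": ("bandwidth",)}

_, unknown_before = build_parser(db)(config)
db["carrier-delay"] = ("carrier-delay",)  # component added, e.g. by Grammar Builder
_, unknown_after = build_parser(db)(config)
print(unknown_before, unknown_after)
```

Because the parser is rebuilt for each run, the line that was a candidate in the first run is recognized in the second without any change to the parsing code itself.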

[0078] With reference to FIG. 6, in the first step 610, the grammar specification (in NC format) is
scanned to develop a mapping of literal strings into lexer tokens. This mapping is used to generate
the source code for the lexer module 60, which is used by the parser 62 to scan the native
configuration text at a low level and return a stream of tokens rather than literal ASCII text. Both
the lexer 60 and parser 62 source code are created by combining the processed grammar
specification with source code templates which contain common constructs which are invariant
from run to run of the system.

[0079] In the second step 620, the source code for the parser 62 is generated, by iterating over the
grammar rules contained in each component 26 and creating a YACC-compliant rule. For each
grammar rule contained in each component 26, the system generates a YACC rule which matches
tokenized syntax seen in the configuration being analyzed. The system also generates an
appropriate semantic action for each rule. In the case of configuration analysis, semantic actions
involve instructions for building an in-memory representation of instances of recognized
components 54 as well as insertion of data into objects as member variables, plus some
reorganization of the resulting "instance tree." Grammar rules are sometimes rewritten to turn the
more compact and expressive Extended Backus-Naur Form (EBNF) syntax specifications into
YACC-style BNF (Backus-Naur Form) specifications. Rewriting is done whenever necessary.
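
One common form of EBNF-to-BNF rewriting is the expansion of optional elements into explicit alternatives. The following sketch shows that single transformation only; the convention of marking an optional symbol with a trailing "?" and the function names are assumptions made for illustration, not the grammar specification language of the embodiment.

```python
from itertools import product
from typing import List


def expand_optionals(rhs: List[str]) -> List[List[str]]:
    """Rewrite an EBNF right-hand side with optional symbols (suffix '?')
    into the equivalent list of plain BNF alternatives."""
    choices = []
    for sym in rhs:
        if sym.endswith("?"):
            choices.append([[sym[:-1]], []])  # symbol present, or absent
        else:
            choices.append([[sym]])
    return [[s for part in combo for s in part] for combo in product(*choices)]


def to_yacc(name: str, rhs: List[str]) -> str:
    """Render one rule as YACC-style BNF text (semantic actions omitted)."""
    alts = [" ".join(alt) for alt in expand_optionals(rhs)]
    return f"{name}\n    : " + "\n    | ".join(alts) + "\n    ;"


print(to_yacc("ip_route", ["IP", "ROUTE", "ADDR", "MASK", "DISTANCE?"]))
```

A rule with n optional symbols expands to 2^n alternatives under this scheme, which is one reason a generator might prefer to introduce helper rules instead when many options are present.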

[0080] Once the source code for the lexer 60 and parser 62 is generated, at step 630 each is
compiled and then dynamically loaded by the Device Learning System 44. The system is now
ready to parse the native device configuration 50, which has been previously retrieved. The
generated parser may be an LALR (Look-Ahead LR) parser, or alternatively may be a GLR
(Generalized LR) parser, in order to allow resolution of certain ambiguous command
syntaxes found in some native configurations.

[0081] At step 640, the parser 62 is run with the native configuration 50 as input, with parsing
occurring in a fairly normal fashion, except for handling of parse errors. As the parser 62 receives
a stream of tokens from the lexer 60, it matches sequences of tokens which form rules in the
grammar which was synthesized from the current state of the component database 28.

[0082] Each rule corresponds to the syntax 36 of a single component 26, expressed in YACC-style 
specification. When a rule is matched in the native configuration input, a component instance 
object is created and added to an instance tree maintained by the parser 62. In addition, semantic 
actions are triggered which handle additional instance construction activities, such as copying 
parsed data values into object member variables.

[0083] In conventional parsing implementations, a parse error would indicate that the parser 62
encountered syntax which is illegal given the parser's grammar definition. In the parser 62 of an
embodiment, since the parser grammar is generated from the library 28 of components, parse errors
represent configuration commands that are not yet part of the component database 28. Thus, parser
exceptions which result from unknown grammatical constructs are handled in a separate step, to
create candidate components 58 before the configuration is given to the user for viewing and
editing. Recognition of candidate components 58 through parse error handling is described in
detail below.

[0084] Once the parser is finished parsing the native configuration 50, at step 650 it returns a data
structure called an "instance tree." This data structure contains instances 20 which house not only
the parsed syntax but also named member variables that were recognized during parsing, organized
in strict conformance to the topology of the original grammar specification.

[0085] The instance tree is relatively flat after parsing, and thus is reorganized along a number of
dimensions at step 660. Hints in each component's stored syntax are used to move instances
around within the instance tree. This is done, for example, to group together related instances (e.g.,
ACL entries, routing advertisements). Tree reorganization serves both to enhance user
comprehension and to provide hooks for implementation of vendor-neutral component
relationships and other post-processing based on metadata. Following recognition of candidate
components 58 at step 670, the tree is then compressed to remove empty instances following
reorganization and generally collapse redundancies at step 680. Again, this is done both to enhance
user comprehension and to provide a post-processing hook for cleanup following other
post-processing based on metadata.
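
The reorganization and compression passes can be sketched as two small tree transformations. The Instance class, the group_hint field, and the rule that an instance is "empty" when it has neither data nor children are all assumptions made for this illustration; the actual hints and tree layout of the embodiment are not specified here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Instance:
    kind: str
    data: Dict[str, str] = field(default_factory=dict)
    children: List["Instance"] = field(default_factory=list)
    group_hint: Optional[str] = None  # hypothetical hint from the component's syntax


def reorganize(root: Instance) -> None:
    """Move children that share a group hint under one grouping node."""
    groups: Dict[str, Instance] = {}
    new_children: List[Instance] = []
    for child in root.children:
        if child.group_hint is None:
            new_children.append(child)
            continue
        if child.group_hint not in groups:
            groups[child.group_hint] = Instance(kind=child.group_hint)
            new_children.append(groups[child.group_hint])  # keep first-seen position
        groups[child.group_hint].children.append(child)
    root.children = new_children


def compress(node: Instance) -> None:
    """Drop instances that carry neither data nor children."""
    for child in node.children:
        compress(child)
    node.children = [c for c in node.children if c.data or c.children]


root = Instance("device", children=[
    Instance("acl-entry", {"rule": "permit ip any any"}, group_hint="acl 101"),
    Instance("hostname", {"name": "r1"}),
    Instance("acl-entry", {"rule": "deny ip any any"}, group_hint="acl 101"),
    Instance("candidate"),  # empty instance left over from post-processing
])
reorganize(root)
compress(root)
print([c.kind for c in root.children])
```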

[0086] Finally, at step 690 the instance tree is analyzed at the policy level to detect nodes which
represent instances of policies 34 stored in the component library 28. Policy analysis begins with
the instance tree as it exists after parsing and reorganization. This instance tree contains references
to base components 26 and associated data found during parsing. At the level of policies, which
can span many devices, none of the recognized components 54 recognized during parsing are yet
understood.

[0087] Recognition of policies occurs by attempts to match policies against the instance tree, as
shown in the method of FIG. 7. For each policy 34 stored in the database 28 and retrieved at step
705, we attempt to match the first sub-component contained in the policy against the device
instance tree at step 710. If no match occurs, then the policy is not represented on a device and we
abort to the next policy in the list (because the entire policy must match to be recognized) and
return to step 705.

[0088] At step 715, if the first contained sub-component is matched at one or more places in the
instance tree, we then look at two sub-cases. Some policies simply aggregate a set of components
and data for use as a "package" or policy. In these cases, the order in which components are
recognized in the tree does not matter. In other cases, however, order matters. Route maps or
access control lists (both of which are represented as policies) aggregate together components, but
do so in a particular order. Processing branches to step 720 for unordered policies, and to step 730
for ordered policies.

[0089] For an unordered policy, at step 720, the next sub-component in the instance tree is checked
to see if it matches a component in the policy. If not, we abort to the next policy to be recognized,
at step 705. When we find a match between the first sub-component of a policy and the device
instance tree, it is sufficient to simply find matches for the remainder of the policy sub-components
elsewhere in the instance tree within the sub-tree of the node which contained the first component
match. The latter requirement prevents situations where trivial matches in widely disparate
segments of the tree are misinterpreted as the presence of a policy on a device. As each component
is matched to a policy, a check is made to determine if the policy is fully matched, at step 725. If it
is, then processing advances to step 740, otherwise the next component in the instance tree is tested
at step 720.

[0090] For an ordered policy, at step 730, the next sub-component in the instance tree is checked
to see if it matches a component in the policy. If not, we abort to the next policy to be recognized,
at step 705. Once we match the first sub-component within the instance tree, we then walk the
remainder of the sub-components in order to determine if they match the instance tree in the correct
order. Only if all components contained within a policy match components found in the instance
tree, in the correct order, is the policy called a "match." As each component is matched to a policy,
a check is made to determine if the policy is fully matched, at step 735. If it is, then processing
advances to step 740, otherwise the next component in the instance tree is tested at step 730.

[0091] Assuming that a policy is matched in the instance tree, we then replace the original
components within the instance tree with the policy itself, at step 740. This is done by finding the
site in the instance tree at which the first sub-component of the policy matched the instance tree,
and replacing that component with the policy itself. All other sub-components which matched are
then simply deleted from the instance tree in order to prevent duplication.
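
The ordered and unordered matching cases, and the replacement at step 740, can be sketched as follows. This is a simplified illustration under stated assumptions: components are compared by a single "kind" string, the search is limited to the direct children of one parent node rather than a full sub-tree, and the names Node, Policy, match_indices and apply_policy are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Policy:
    name: str
    parts: List[str]  # kinds of the policy's sub-components
    ordered: bool


def match_indices(kinds: List[str], policy: Policy) -> Optional[List[int]]:
    """Return the positions in `kinds` matched by the policy parts, or None."""
    found: List[int] = []
    start = 0
    for part in policy.parts:
        lo = start if policy.ordered else 0  # ordered: only look further right
        idx = next((i for i in range(lo, len(kinds))
                    if kinds[i] == part and i not in found), None)
        if idx is None:
            return None  # the entire policy must match
        found.append(idx)
        start = idx + 1
    return found


def apply_policy(parent: Node, policy: Policy) -> bool:
    """Replace the matched children of `parent` with a single policy node."""
    hit = match_indices([c.kind for c in parent.children], policy)
    if hit is None:
        return False
    first, rest = hit[0], set(hit[1:])
    parent.children[first] = Node(policy.name)  # site of the first sub-component
    parent.children = [c for i, c in enumerate(parent.children) if i not in rest]
    return True


tree = Node("device", [Node("match-ip"), Node("hostname"), Node("set-metric")])
route_map = Policy("route-map", ["match-ip", "set-metric"], ordered=True)
apply_policy(tree, route_map)
print([c.kind for c in tree.children])
```

The same sub-components listed in the opposite order would fail to match as an ordered policy but would still match as an unordered one.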

[0092] This process is repeated for all policies in the policy database. The end result of this
analysis is an instance tree which contains any policies which the device implements, any
components which are identified within the device configuration but are not part of a policy, and
any candidate components which are recognized for the first time. Such an instance tree is
considered "complete" and is the output of the Device Learning System 44, and is ready to be
stored persistently to the component database 28 and within a version control database.

[0093] Once post-processing is complete, the instance tree for a device configuration is complete,
and can be persistently stored to versioned storage within the system. At this point, the 
configuration represents a PD configuration 16. 

[0094] In an embodiment, the instance tree is persisted to an XML data format, and stored as a file
in a version control database. Once stored in version control, we can reconstruct the change history
of the configuration between any two editing sessions (so long as an editing session is saved and
not abandoned or purged).
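
As an illustration of XML persistence, the sketch below serializes a small instance tree and reads it back using Python's standard xml.etree.ElementTree module. The element and attribute names ("instance", "member", "kind", "name") and the dictionary layout of the in-memory tree are assumptions chosen for this example; the actual schema of the embodiment is not given.

```python
import xml.etree.ElementTree as ET


def to_xml(instance: dict) -> ET.Element:
    """Convert an instance (kind, data, children) into an XML element."""
    elem = ET.Element("instance", {"kind": instance["kind"]})
    for name, value in instance.get("data", {}).items():
        ET.SubElement(elem, "member", {"name": name}).text = value
    for child in instance.get("children", []):
        elem.append(to_xml(child))
    return elem


def from_xml(elem: ET.Element) -> dict:
    """Rebuild the in-memory instance from its XML form."""
    return {
        "kind": elem.get("kind"),
        "data": {m.get("name"): m.text for m in elem.findall("member")},
        "children": [from_xml(c) for c in elem.findall("instance")],
    }


tree = {
    "kind": "interface",
    "data": {"name": "serial 1/0", "bandwidth": "1534"},
    "children": [{
        "kind": "candidate",
        "data": {"unknown_syntax": "carrier-delay msec 300"},
        "children": [],
    }],
}
text = ET.tostring(to_xml(tree), encoding="unicode")
restored = from_xml(ET.fromstring(text))
print(text)
```

Because the serialized form is plain text, successive saved versions can be compared by an ordinary version control system to recover the change history between editing sessions.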

[0095] In the text above, we described the overall method of an embodiment, for parsing the native
configuration and recognizing instances 20 of library components 26. We also described the fact
that parse errors indicate that the parser has encountered configuration syntax that is not contained
within any recognized component 54. Thus, the syntax errors themselves represent a significant
source of information about potentially new components. The method for handling parse errors is
described in detail below.

[0096] Under normal usage, a conventional parser stops parsing when a syntax error is
encountered, and reports the error and the offending syntax to the caller. This approach is typically
seen within compilers, for example. Thus, the location of the error is known.

[0097] Within the parsing step 640 performed by the Device Learning System 44, parse errors are
processed according to the method of FIG. 8. Parse errors are trapped and marked for later
post-processing at step 810. Instead of adding a normal component instance to the output tree, an
"unknown" region marker is added to mark the spot, at step 820. Parsing then continues at step
830 until the next error occurs or the end of the input configuration 50 is reached. The resulting
output is therefore an instance tree 71, which is a tree-organized data structure, composed mostly of
component instances 72 with data, and an occasional "unknown" region 74, which marks a place
where un-parseable syntax occurred.
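
The trap-mark-continue behavior can be sketched with a toy line-oriented parser. This is an assumption-laden illustration: the real parser is a generated LALR or GLR parser, whereas here rules are simple literal prefixes, resynchronization happens at the next end-of-line token, and the names lex and parse are hypothetical.

```python
from typing import Dict, List, Tuple

Token = Tuple[int, str]  # (token ID, text); IDs record input order


def lex(text: str) -> List[Token]:
    """Split the configuration into tokens, numbering them from 1."""
    tokens: List[Token] = []
    next_id = 1
    for line in text.splitlines():
        for word in line.split() + ["\\n"]:  # explicit end-of-line token
            tokens.append((next_id, word))
            next_id += 1
    return tokens


def parse(tokens: List[Token], rules: Dict[str, Tuple[str, ...]]) -> List[dict]:
    """Build a flat instance tree; unmatched commands become unknown markers."""
    tree: List[dict] = []
    pos = 0
    while pos < len(tokens):
        end = pos
        while tokens[end][1] != "\\n":
            end += 1
        line = tokens[pos:end + 1]
        words = tuple(text for _, text in line[:-1])
        name = next((n for n, prefix in rules.items()
                     if words[: len(prefix)] == prefix), None)
        if name is not None:
            tree.append({"component": name, "token_ids": [i for i, _ in line]})
        else:
            # Trap the error: record only where parsing failed, then continue.
            tree.append({"unknown_at": line[0][0]})
        pos = end + 1
    return tree


config = (
    "interface serial 1/0\n"
    "ip address 192.168.1.1 255.255.255.252\n"
    "carrier-delay msec 300\n"
    "bandwidth 1534"
)
rules = {"interface": ("interface",), "ip-address": ("ip", "address"),
         "bandwidth": ("bandwidth",)}
tree = parse(lex(config), rules)
print(tree)
```

Note that the marker stores only a location (a token ID), not the extent of the unknown syntax, which matches the situation described in the text.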

[0098] At this stage, however, the Device Learning System 44 does not know the extent or
contents of the "unknown" region 74, since this syntax was un-parseable and no semantic actions
were taken. Only the location is known. Resolution of "unknown" region markers into candidate
components 58 is done by post-processing the instance tree 71 in combination with the token
database built by the lexer 60.

interface(1) serial(2) 1/0(3) \n(4)
ip(5) address(6) 192.168.1.1(7) 255.255.255.252(8) \n(9)
carrier-delay(10) msec(11) 300(12) \n(13)
bandwidth(14) 1534(15) \n(16)

Table 1: Example native syntax, tokenized with token IDs (third line is "unknown" syntax)

[0100] During the initial parsing run, as discussed above, the lexer 60 feeds tokens to the parser 62.
The lexer 60 also builds a table of tokens with unique identifiers (IDs) which indicate the order in
which tokens were found in the original input text (Table 1). The parser 62 records the unique ID
for each token in the instance tree 71 with each token making up a matched rule (Table 2). Token
IDs are used during post-processing to isolate and identify the contents of unknown syntax regions
which become candidate components 58. When an "unknown" syntax region 74 is encountered
through the parse error handler, it is marked with the token ID of the token where parsing failed to
reduce a grammar rule (in the Table 1 example, this is token #10).



ID   Token             UsedFlag
1    interface
2    serial
3    1/0
4    \n
5    ip
6    address
7    192.168.1.1
8    255.255.255.252
9    \n
10   carrier-delay
11   msec
12   300
13   \n
14   bandwidth
15   1534
16   \n

Table 2: Lexer token database



[0101] Turning to FIG. 10, after initial parsing, during post-processing (step 670 of FIG. 6), the
output instance tree 71 is walked in order to resolve the extent and contents of each unknown
marker 74, at step 1010. Each instance object 72 in the tree contains a sequence of tokens which
make up the syntax of the component instance. For each token referred to within an instance 72, at
step 1020 we look up the token ID within the lexer's internal database and mark it as used (Table
3). Thus, at the conclusion of the instance tree traversal, each token making up the syntax of
known components 54 is marked within the lexer database.



ID   Token             UsedFlag
1    interface         x
2    serial            x
3    1/0               x
4    \n                x
5    ip                x
6    address           x
7    192.168.1.1       x
8    255.255.255.252   x
9    \n                x
10   carrier-delay
11   msec
12   300
13   \n
14   bandwidth         x
15   1534              x
16   \n                x

Table 3: Lexer token database, post-processed to mark tokens used in recognized components

[0102] As a result, any tokens that remain unmarked within the lexer database are necessarily part
of syntax regions which do not match anything in the component database 28. Furthermore, each
unknown syntax region 74 is associated with some contextual information concerning association
with other components (depending upon the nature of the vendor configuration language), because
of the position of the unknown region in the parsed instance tree 71 (see FIG. 9A). For example,
referring to FIG. 9A, the "unknown" sub-component 75 is known to be a sub-component of the
known "interface" component 77, because the unknown region 74 is located in the parsed instance
tree 71 between two known sub-components 78, 79 of the "interface" component.

[0103] At step 1030, we walk the instance tree again, sequentially post-processing each
"unknown" region marker 74. The token ID listed in each unknown marker 74 is followed into the
lexer's token database at step 1040. This token ID only marks a spot within the lexer database;
we still know nothing about the extent of the region. To resolve the extent and contents of the
unknown region, we therefore walk the lexer token database in both directions from this starting
point, until we encounter tokens that had been previously marked as "used" by existing instances at
step 1050. The region between these boundaries is thus the contents of a new candidate component
58 which replaces the unknown marker 74 in the output instance tree (Table 4). Finally, at step
1060, the tokens which made up the unknown region 74 are then marked as "used" to prevent
future passes through the lexer database from encountering the same unknown twice.

Component: interface
    Name = "serial 1/0"
    Address = "192.168.1.1"
    Netmask = "255.255.255.252"
    Bandwidth = "1534"
    Component: Candidate
        unknown_syntax: "carrier-delay msec 300"

Table 4: Post-processed instance tree fragment, with candidate component inserted in proper
context
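
The bidirectional walk through the lexer token database can be sketched as below. The representation of the database as a list of dictionaries with "id", "text" and "used" keys, and the function name resolve_unknown, are assumptions for this illustration only.

```python
from typing import Dict, List


def resolve_unknown(token_db: List[Dict], start_id: int) -> List[str]:
    """Walk outward from start_id until used tokens bound the region,
    mark the region as used, and return its token texts."""
    index = {row["id"]: i for i, row in enumerate(token_db)}
    pos = index[start_id]
    if token_db[pos]["used"]:
        return []  # region already consumed by an earlier marker
    lo = pos
    while lo > 0 and not token_db[lo - 1]["used"]:
        lo -= 1
    hi = pos
    while hi + 1 < len(token_db) and not token_db[hi + 1]["used"]:
        hi += 1
    region = token_db[lo:hi + 1]
    for row in region:
        row["used"] = True  # prevent later passes from finding it again
    return [row["text"] for row in region]


texts = ["interface", "serial", "1/0", "\\n", "ip", "address", "192.168.1.1",
         "255.255.255.252", "\\n", "carrier-delay", "msec", "300", "\\n",
         "bandwidth", "1534", "\\n"]
# Tokens 10 through 13 were not used by any recognized component.
token_db = [{"id": i, "text": t, "used": not (10 <= i <= 13)}
            for i, t in enumerate(texts, start=1)]

first = resolve_unknown(token_db, 10)
second = resolve_unknown(token_db, 10)
print(first, second)
```

A second lookup of the same region returns an empty list, because the first pass has already marked every token in the region as used.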

[0104] The resulting candidate component 58 is not yet stored within the component database 28,
but it is a fully featured component instance and is persisted as part of the final device
configuration. At this point, the candidate component 58 is private, occurring only within a single
device configuration. It cannot be reused or referred to by name. This allows new syntax to be
tried within the context of a single device 10 without side effects on the network as a whole.
Candidate components 58 become available for reuse when or if Grammar Builder 45 converts the
candidate components 58 to components 26.

[0105] The method used to resolve unknown regions 74 into candidate components 58 is "greedy"
in the sense that adjacent unknown regions are collapsed into a single candidate. This effect occurs
because we walk the lexer database in both directions from the initial token ID which serves as a
pointer into the database. The first pass through a given unknown region 74 thus consumes all of
the unmarked tokens found, associating them with the first candidate created in a given region.
When further unknown markers in an adjacent set are post-processed, no unused tokens are found
in the lexer database upon de-referencing their pointer IDs. To simplify the implementation,
empty instances are created in such situations, which are compressed out of the final instance tree
during the late stages of post-processing as discussed above.

[0106] Once the candidate components 58 have been identified using the methods discussed
above, these candidate components 58 are then resolved into formal grammars 70 by the Grammar
Builder 45 system. Automated grammar production is a three phase process as shown in FIG. 11.
In the first phase at step 1110, a mechanism is used to acquire a syntax tree for a set of commands.
As an example, within the command interface on Cisco IOS devices, the command interpreter
provides "tab completion" and limited help or prompting for commands. This facility is sufficient
to discover the syntax of unknown commands. Any other method of acquiring these data is also
acceptable, so long as the resulting information is accurate (for example, it is also possible to parse
electronic documentation).

[0107] In the second phase at step 1120, the syntax tree is transformed into a usable component
tree according to a set of algorithms which is partially vendor-neutral, but also includes specific
transformations appropriate to a specific command language. In the third and final phase at step
1130, vendor-specific mechanisms are used to examine semantic, rather than syntactic, issues with
how commands are added and removed from a device, and any "side-effects" they may have
within the configuration. Semantic information about component interaction is added to the
grammar in the form of "tags" which can be used by other applications in addition to the Device
Learning System 44. The resulting grammar is then "complete" apart from any adjustments that
are made to the quality of labels (since automatically generated labels are often fairly difficult for
human users to comprehend).

[0108] Turning to FIG. 12, acquisition of an abstract syntax tree (Phase I) begins with identifying a
set of "command roots", e.g., the first several tokens that make up a command in a line-oriented
configuration language. Roots serve to define the starting point and "breadth" of a syntax tree
search. As an example, we will use the command "ip route 10.0.0.0 255.0.0.0 192.168.1.100 10",
which adds a static route to the IP network 10.0.0.0 via the network link located at 192.168.1.100,
with a "distance" (or preference, essentially) of 10. The "command root" we will start with is the
partial command "ip route," and our goal will be to automatically discover the syntax of "ip route"
commands in IOS 12.2. Starting points are arbitrary; we could equally begin, for example, with
the command root "ip" and discover all commands that follow this root. When Grammar Builder
45 is activated within the system, the "command root" will be provided by the candidate
component 58 being resolved into a resolved grammar 70.

[0109] In order to discover the allowable syntax, at step 1210 an embodiment of the invention uses
a vendor-specific algorithm to "walk" any command completion or command-line help available,
and view the options available at each point in the command structure. This algorithm is
generically called "WalkerViewer." In an embodiment disclosed below, the WalkerViewer
algorithm for Cisco's IOS operating system is discussed in detail. Other operating systems and
other vendors are also supported by modified versions of the WalkerViewer algorithm. For
example, the WalkerViewer algorithm for Cisco's Catalyst operating system differs only in minor
details.

[0110] The WalkerViewer algorithm for Cisco IOS begins at step 1230 by entering "configuration
mode" on a network device (often a device in a test or lab network setting). Within configuration
mode, the operating system provides "command line completion" of partial commands. When an
incomplete command is entered, possible "next completions" are available by pressing "?" at the
end of the partial command fragment. By adding each possible completion to the current
command root recursively, all of the possible command options available below a starting root are
discovered, at step 1240. Termination occurs in a given "branch" whenever we encounter a
carriage-return <cr> or "end of line" (EOL) character.
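
The recursive discovery can be sketched against a stand-in for the device. In this illustration the completion facility is replaced by a hypothetical lookup table (FAKE_DEVICE) whose responses are invented for the example and are not a transcript of any real IOS session; the proxy values, the placeholder strings, and the names walk and paths are likewise assumptions.

```python
from typing import Callable, Dict, List, Tuple

Command = Tuple[str, ...]

# Hypothetical proxy values typed in place of variable placeholders.
PROXY = {"A.B.C.D": "10.0.0.0", "<1-255>": "10"}

# Hypothetical completion responses, keyed by the partial command as typed.
FAKE_DEVICE: Dict[Command, List[str]] = {
    ("ip", "route"): ["A.B.C.D"],
    ("ip", "route", "10.0.0.0"): ["A.B.C.D"],
    ("ip", "route", "10.0.0.0", "10.0.0.0"): ["A.B.C.D"],
    ("ip", "route", "10.0.0.0", "10.0.0.0", "10.0.0.0"): ["<1-255>", "<cr>"],
}


def walk(command: Command, token: str,
         complete: Callable[[Command], List[str]]) -> dict:
    """Recursively query completions below `command`, building a syntax tree."""
    node = {"token": token, "children": []}
    for option in complete(command):
        if option == "<cr>":
            node["children"].append({"token": "<cr>", "children": []})  # branch ends
        else:
            typed = PROXY.get(option, option)  # fill in a proxy for variables
            node["children"].append(walk(command + (typed,), option, complete))
    return node


def paths(node: dict, prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """List every root-to-leaf token sequence in the tree."""
    here = prefix + (node["token"],)
    if not node["children"]:
        return [here]
    return [p for child in node["children"] for p in paths(child, here)]


tree = walk(("ip", "route"), "ip route",
            lambda cmd: FAKE_DEVICE.get(cmd, ["<cr>"]))
for path in paths(tree):
    print(" ".join(path))
```

Against a real device, the `complete` callable would issue the partial command followed by "?" and parse the response; options that can repeat in any order would also need loop handling to guarantee termination.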

[0111] A partial example of the syntax tree 80 obtained by running WalkerViewer against the
command root "ip route" is shown in FIG. 13. For example, when WalkerViewer tests the
command root "ip route" on a network device running IOS 12.2, the following options are
presented as appropriate in the next position within the command syntax: "profile" 81, "vrf" 83,
and "A.B.C.D" 85. The first two are command tokens which trigger further options (and are
irrelevant for this example). The third is a placeholder for an IP address variable, in this case the
destination prefix (as explained in the accompanying text).

[0112] Following the destination prefix further into the tree, we then add a proxy value for
"A.B.C.D" to the end of "ip route" to form our next command root, and query for possible
completions. This action results in a single response: "A.B.C.D destination netmask" 87. This is
added to the partial command string, filling in "proxy" values for any variables, and the process is
repeated.

[0113] At each stage in the recursive process, the set of options is added to a syntax tree as a
series of SyntaxNodes. Each token or variable which can follow a partial command is entered as a
child of the preceding token. Each SyntaxNode records the specific token, its data type (e.g., IP
address, word, integer), and other specific metadata about the node. The example shown in FIG.
13 shows the tree as if we follow the example command given above: "ip route 10.0.0.0 255.0.0.0
192.168.1.100 10" with a carriage-return (<cr>) terminating the command. The full tree, depicting
every possible continuation at each child location, is many times larger.

[0114] As mentioned above, WalkerViewer will proceed down the syntax tree 80 obtained by
command completion until every attempt to delve deeper is terminated by an end-of-line character.
In some cases, however, commands can admit options in any number of locations within a single
line of syntax, which creates repeating "loops" of options as you query command completion. The
"ip route" example displays a simple example of this looping behavior. The options "name" 82,
"permanent" 84, "tag" 86, and the distance metric 88 can occur in any order, which causes each to
present the others as possible completions (i.e., as children of each other on the syntax tree 80).

[0115] WalkerViewer handles situation-specific processing of such occurrences, for example by
accepting a plug-in designed to handle specialized processing needs such as loop processing. One
plug-in recognizes the situation noted in the previous paragraph: a set of options which can occur in
any order and may or may not be present. In such cases, the plug-in marks the parent in the tree as
possessing "children which are allowed to contain loops." This metadata is used below to correctly
post-process loops within the syntax tree 80.
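
A loop-detecting plug-in of this kind might be sketched as follows. The detection rule used here (two or more sibling options that each offer another sibling as a completion) is an assumption chosen for illustration, as are the SyntaxNode fields and the metadata keys; the actual plug-in logic is not described in detail.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SyntaxNode:
    token: str
    children: List["SyntaxNode"] = field(default_factory=list)
    meta: Dict[str, object] = field(default_factory=dict)


def mark_loops(parent: SyntaxNode) -> bool:
    """Mark `parent` if its children offer one another as completions."""
    siblings = {c.token for c in parent.children}
    looping = [c.token for c in parent.children
               if any(g.token in siblings and g.token != c.token
                      for g in c.children)]
    if len(looping) >= 2:
        parent.meta["children_may_loop"] = True
        parent.meta["loop_options"] = looping
        return True
    return False


# Options after the next-hop address; each presents the others as completions.
options = ["name", "permanent", "tag"]
next_hop = SyntaxNode(
    "A.B.C.D",
    [SyntaxNode(o, [SyntaxNode(x) for x in options if x != o]) for o in options]
    + [SyntaxNode("<cr>")],
)
print(mark_loops(next_hop), next_hop.meta)
```

Once a parent is marked, the walker can stop descending into the repeated options, and later post-processing can treat them as an unordered, optional set.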

[0116] At step 1250, the output from WalkerViewer is "raw data" for all subsequent steps in
processing, and may be a very large and poorly structured tree, from the perspective of human
comprehension. It contains a combinatorial set of all the options available for a given command
root in all of the orders possible. Additionally, with the entire syntax tree 80 it isn't immediately
obvious where human-perceivable "commands" begin and end. Thus it is helpful to post-process
the tree into a well-structured grammar.

[0117] In the second phase (Phase II), the raw syntax tree 80 is post-processed into a grammar
which includes structure recognizable to a human network engineer. In the following text, we first
introduce the details of grammar construction within an embodiment of the invention, and then
cover the four steps involved in post-processing SyntaxNodes into a finished grammar according to
an embodiment of the invention.

[0118] In addition to the purely technical requirement that we generate grammars which are
capable of parsing each vendor's commands, an embodiment of the invention also uses the
grammar as the basis for constructing many user interface elements. Thus, in an embodiment, we
generate formal grammars which are not only correct but structured in a human-understandable
manner. Since grammars help auto-generate user interfaces 40 such as configuration editors or
tree-based views of a device configuration, the grammars of an embodiment of the invention have
the following conditions:

[0119] Grammars should be as small as possible, consistent with the need for correctness. Small
grammar size is helpful not only for user comprehension, but overall application performance.

[0120] Grammars should be as "shallow" as possible. In other words, if the syntax of commands
is represented as a tree, the user should only have to "drill down" the minimum number of levels
possible to discover an option or command they are seeking.

[0121] Grammars should reuse as many common constructs as possible. We should avoid
duplication whenever possible.

[0122] The algorithms take a tree of SyntaxNodes as output from WalkerViewer, and produce a
mathematically correct grammar, and then post-process this grammar into a form which attempts to
meet these conditions.

[0123] The grammars used in an embodiment of the invention are composed of the following
constructs: 

[0124] LITERAL - a token that appears verbatim in the command being matched. Groups of
literals are the key to disambiguating the different "commands" within a vendor's configuration
language.

[0125] ATTRIBUTE - a token that can have a range of values, often constrained by "type" but
supplied as data by the user. Types are defined by the language itself and the grammar author.
Examples include types such as integers, IP address, word, phrase with embedded white-space, and
so on.

[0126] LIST - an ordered set of any of the grammar constructs in the current list. A list such as "A
B" has the grammatical meaning of "an object of type A followed by an object of type B."

[0127] OR-BLOCK - an unordered set of any of the grammar constructs in the current list. An or- 
block such as "A | B" has the grammatical meaning of "a token of type A or a token of type B may 
occur in this position."

[0128] SECTION - a named grouping of grammar constructs. Sections are a "convenience" in a
pure sense, simply allowing grammars to "reuse" groups of elements by reference. This keeps the
size of the grammar rule-base small, which is a concern when creating program code to implement
the parser.

[0129] Certain kinds of sections are also used to mark the boundaries of human-understandable
concepts. For example, the final grammar for all of the commands in Cisco's IOS is one giant tree.
In order to mark segments of the tree which represent the rules for what users would recognize as
components, or individual IOS commands, we mark certain sections as "FINAL." In a similar
way, we also mark some sections as "CONTAINER" sections, to organize the tree of commands
into a hierarchy which mirrors common networking concepts, instead of presenting a flat,
unorganized set of commands. These markers are not true parts of the grammar from a parsing
perspective, but instead are grammatical metadata used by an embodiment to construct the user
interface.

[0130] In a similar way, certain positions in a command may admit a constrained set of literals
or attributes. These positions can be marked as an ENUMERATION, which is really an
OR-BLOCK of allowable LITERALS or ATTRIBUTES permitted in a given position. Again,
enumerations are not necessary for parsing or compiling, but instead represent a grammatical
optimization used in constructing a user interface which is intuitive and comprehensible without
knowledge of system internals.

[0131] As a further simplification, attributes or literals which can take only two values ("on" or
"off") are transformed into BOOLEAN constructs. These do not affect parsing at all, but instead
are used as an optimization in constructing the user interface.
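
The constructs above can be pictured as a small data model. The following is only a sketch under stated assumptions: the class names (Literal, Attribute, ListOf, OrBlock, Section), the render helper, and the "marker" field are hypothetical and are not taken from the described system. The FINAL and CONTAINER markers are carried as metadata and play no part in rendering, mirroring the statement that they are not true parts of the grammar.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Literal:
    text: str                      # token that must appear verbatim


@dataclass
class Attribute:
    type_name: str                 # e.g. "ipaddress", "integer", "word"
    label: str                     # user-facing name of the value


@dataclass
class ListOf:
    items: List["Node"]            # ordered: "A followed by B"


@dataclass
class OrBlock:
    choices: List["Node"]          # alternatives: "A or B"
    enumeration: bool = False      # user-interface hint only


@dataclass
class Section:
    name: str
    body: "Node"
    marker: str = ""               # "", "FINAL" or "CONTAINER" (metadata only)


Node = Union[Literal, Attribute, ListOf, OrBlock, Section]


def render(node: Node) -> str:
    """Render a construct as a compact, human-readable rule string."""
    if isinstance(node, Literal):
        return node.text
    if isinstance(node, Attribute):
        return f"<{node.type_name}:{node.label}>"
    if isinstance(node, ListOf):
        return " ".join(render(item) for item in node.items)
    if isinstance(node, OrBlock):
        return "(" + " | ".join(render(c) for c in node.choices) + ")"
    if isinstance(node, Section):
        return render(node.body)
    raise TypeError(f"unknown construct: {node!r}")


# Hypothetical fragment of the "ip route" command used as the running example.
route = Section(
    "StaticRoute",
    ListOf([
        Literal("ip"),
        Literal("route"),
        Attribute("ipaddress", "prefix"),
        OrBlock([Attribute("ipaddress", "Gateway"),
                 Attribute("word", "Interface Name")]),
    ]),
    marker="FINAL",
)
print(render(route))
```

Keeping the markers outside the rendering path is a design choice made for this sketch; a user interface layer could read them without the parser ever needing to.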

[0132] The second phase in grammar production takes the raw tree of SyntaxNodes developed by
WalkerViewer, and produces a set of grammar rules. In our simple example, the end result is going
to be a well-structured grammar which can parse commands beginning with "ip route" and
containing numerous IP addresses and other data values.

[0133] The result of processing of the syntax tree 80 is shown in Table 5 below. In contrast to the
syntax tree 80, note that there is only one "path" through the grammar rule, as defined by end-of-
line characters. Multiple possible paths through the tree of SyntaxNodes have been collapsed into
alternate or optional sections. The top-level rule is marked as "FINAL," denoting a grammar
section which corresponds to a single user-perceptible command - in this case, our "ip route..."
example.

[0134] In the text below we discuss the algorithms for post-processing the syntax tree 80 into
Table 5 below.

    ATTRIBUTE(ipaddress, "prefix")
    ATTRIBUTE(forward_mask, "netmask")
    CiscoIOSStaticRouteDestination
    [ CiscoIOSStaticRoute_Distance ]
    [ CiscoIOSStaticRoute_Tag ]
    [ CiscoIOSStaticRoute_Permanent ]
    [ CiscoIOSStaticRoute_Name ]
    EOL;

CiscoIOSStaticRouteDestination:
    Destination_Hop | Destination_Interface;
Destination_Hop:
    ATTRIBUTE(ipaddress, "Gateway");
Destination_Interface:
    ATTRIBUTE(word, "Interface Name");
CiscoIOSStaticRoute_Distance:
    ATTRIBUTE(integer, "Distance Metric");
CiscoIOSStaticRoute_Tag:
    ATTRIBUTE(word, "Tag");
CiscoIOSStaticRoute_Name:
    ATTRIBUTE(word, "Name");
CiscoIOSStaticRoute_Permanent:
    BOOLEAN("Permanent", "permanent", "");

Table 5: Post-processed component reflecting the "ip route" command with options.

[0135] Turning to FIG. 14, the first step 1410 is to simply transform SyntaxNodes into their
equivalent grammar constructs through equivalences between SyntaxNode types and grammar
constructs. Some of the translation rules used include:

[0136] SyntaxNode literal -> grammar LITERAL

[0137] SyntaxNode data value (any type) -> grammar ATTRIBUTE of the corresponding type

[0138] SyntaxNode terminal -> grammar EOL (a special token denoting a vendor-specific end-of- 
line character) 

[0139] Each level in the tree of SyntaxNodes is a set of "siblings" which are translated into an
OR-BLOCK.

[0140] Ancestor/descendant lines through levels in the syntax tree form a LIST.

[0141] Transform any nodes marked as "children allowed to loop" in the previous phase into
grammar sections which are allowed multiple times ("multiply allowed").
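
The translation rules above can be sketched as a short recursive walk. This is an illustration only: the SyntaxNode class shown here, its "kind" values, and the tuple-based output format are assumptions made for the example, not the structures produced by WalkerViewer. Looping children ("multiply allowed") are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SyntaxNode:
    kind: str                      # "literal", "data" or "terminal" (hypothetical)
    value: str = ""                # literal text, or the data type name
    children: List["SyntaxNode"] = field(default_factory=list)


def translate_node(node):
    """Translate one node; its descendants follow it in a LIST."""
    if node.kind == "terminal":
        return ("EOL",)
    if node.kind == "literal":
        head = ("LITERAL", node.value)
    elif node.kind == "data":
        head = ("ATTRIBUTE", node.value)
    else:
        raise ValueError(f"unknown SyntaxNode kind: {node.kind}")
    if not node.children:
        return head
    return ("LIST", head, translate_level(node.children))


def translate_level(siblings):
    """Translate one level of the tree; siblings become an OR-BLOCK."""
    parts = [translate_node(n) for n in siblings]
    return parts[0] if len(parts) == 1 else ("OR", *parts)


# "ip route <prefix>" optionally followed by the literal "permanent".
tree = SyntaxNode("literal", "ip", [
    SyntaxNode("literal", "route", [
        SyntaxNode("data", "ipaddress", [
            SyntaxNode("terminal"),
            SyntaxNode("literal", "permanent", [SyntaxNode("terminal")]),
        ]),
    ]),
])
grammar = translate_level([tree])
print(grammar)
```

The result contains two EOL terminations, one per path, which is the raw and unpolished form that the later steps are meant to compact.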

[0142] The output grammar has no sections, has repeated literals and elements in many places, and
many EOL terminations, but is technically capable of parsing device input. The results will not
mean much to a human observer, but the grammar is mathematically correct.

[0143] At step 1420, the resulting grammar is transformed to remove unnecessary EOL
terminations, and guarantee that each command follows a single path to a single EOL termination. 
Multiple terminations occur because most commands have sets of options which can be used in 
different combinations to form a valid command. 

[0144] As an example, consider our simple "ip route" command above. The "tag" 86, "name" 82,
"permanent" 84 and distance metrics 88 can be used alone, or in any combination (and in any
order). Thus, the grammar generated by transforming the syntax tree 80 contains multiple paths -
one for each ordering and combination of options. We restructure the grammar to "compact" it and
produce a simple, clean structure. This is done by examining parts of the grammar where
OR-BLOCKs and LISTs contain common elements and re-arranging them into a new set of lists and
or-blocks which lead to a single terminal EOL marker.

[0145] Because the grammar for a command root is automatically generated, it may have a
number of structural anomalies which don't affect parsing but are strange to the human user. At
step 1430 of this phase, the grammar is restructured to eliminate common anomalies. Below is a
list of some example restructurings. Additionally, each new vendor has the potential to expand
the list of desired restructuring transformations. The plug-in architecture of an embodiment of the
invention makes it easy to expand the list of restructuring transformations to handle changes in
vendor-specific situations.

[0146] Common endings that are repeated in OR-BLOCKs are "factored" out. For example, if the
grammar contains the following:

a x y | b x y | c x y

where a, b, c, x, y are literals or other grammar constructs, we refactor the grammar as follows:

(a | b | c) x y
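
A minimal sketch of this factoring step follows, assuming each alternative is represented as a plain list of hashable tokens. The function name factor_common_suffix and the list representation are hypothetical; the sketch leaves at least one leading token in every alternative so that no branch becomes empty.

```python
def factor_common_suffix(alternatives):
    """Split alternatives into (heads, shared_suffix).

    alternatives: list of token lists, e.g. [["a","x","y"], ["b","x","y"]].
    """
    n = 0
    # Never consume an alternative completely: keep one head token in each.
    limit = min(len(alt) for alt in alternatives) - 1
    while n < limit and len({alt[-(n + 1)] for alt in alternatives}) == 1:
        n += 1
    if n == 0:
        return [list(alt) for alt in alternatives], []
    heads = [alt[:-n] for alt in alternatives]
    return heads, list(alternatives[0][-n:])


heads, suffix = factor_common_suffix([["a", "x", "y"],
                                      ["b", "x", "y"],
                                      ["c", "x", "y"]])
print(heads, suffix)
```

Read back as a rule, the heads form the OR-BLOCK (a | b | c) and the suffix is the LIST x y that follows it.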

[0147] Another common transformation is to find common sub-expressions in nested portions of
the grammar, and "flatten" them into a single list. For example, if the grammar contains the
following:

(a | b | c | d (a | b | c))

where a, b, c, d are literals or other grammar constructs, we refactor the grammar as follows:

(a | b | c | d)...

[0148] In this refactoring, the "..." designation indicates a grammar construct which is allowed to
occur multiple times. These representations are not precisely equivalent mathematically, but for
practical purposes the resulting grammar correctly parses configuration text and is far more
understandable to users.

[0149] Automatic generation may also leave optional elements "orphaned" within an OR-
BLOCK. These elements can be "flattened" into a simpler OR-BLOCK:

[ a ] | [ b ] | [ c ] is refactored into a | b | c

[0150] Finally, we make a full combinatorial search of the grammar to find segments which are
repeated. These segments are put into sections (as defined above), and their occurrences replaced
by reference to the new section. This drastically reduces the overall size of the grammar by
removing redundancy, at some cost to complexity.
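
The search for repeated segments might be sketched as below. This is a simplified stand-in, not the described algorithm: it assumes rules are flat token lists, extracts only the single longest repeated run per call, and does not guard against overlapping occurrences within one rule. The function name and the section name passed in are hypothetical.

```python
from collections import Counter


def extract_repeated_section(rules, section_name, min_len=2):
    """Factor the longest run of tokens that occurs at least twice into a section.

    rules: dict mapping rule name -> list of tokens.
    Returns a new dict that also contains the new section.
    """
    counts = Counter()
    for tokens in rules.values():
        for i in range(len(tokens)):
            for j in range(i + min_len, len(tokens) + 1):
                counts[tuple(tokens[i:j])] += 1
    repeated = [seg for seg, count in counts.items() if count >= 2]
    if not repeated:
        return dict(rules)
    best = max(repeated, key=len)

    result = {section_name: list(best)}
    for name, tokens in rules.items():
        rewritten, i = [], 0
        while i < len(tokens):
            if tuple(tokens[i:i + len(best)]) == best:
                rewritten.append(section_name)   # reference to the new section
                i += len(best)
            else:
                rewritten.append(tokens[i])
                i += 1
        result[name] = rewritten
    return result


rules = {
    "r1": ["ip", "route", "A", "B", "C"],
    "r2": ["no", "ip", "route", "A", "B"],
}
print(extract_repeated_section(rules, "S"))
```

Calling the function repeatedly with fresh section names would approximate the exhaustive search, with the expected trade-off: fewer tokens overall, but more indirection for a reader to follow.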

[0151] After the previous step, the grammar is now well structured and nearly ready for use. The
grammar, however, is a single tree, with no "boundaries" which denote where commands begin
and end. This is mathematically unnecessary for parsing, but crucial for presenting the results of
parsing to the human user. Thus, in the final step of Phase II, at step 1440, we insert "section"
constructs within the grammar with FINAL markers wherever command boundaries occur. Each
command, complete with its suite of options and attributes, is therefore turned into a component 26.
These components 26 are then able to be stored in the component database 28, as fully-functional
and shareable components, and can be reused in policies 34 as desired.

[0152] The algorithms for marking command boundaries may differ between vendors and
command languages, necessitating a modular architecture (as with other steps in the Device
Learning System). In this section, the algorithm for Cisco's IOS and CatalystOS command
languages is described by way of example. This algorithm will also work for any command
language which has a rigorous form of command negation and command completion on the
command line. Other embodiments for other vendors and command languages are also possible.

[0153] Network device command languages typically have "positive" and "negative" forms for
each piece of functionality. "Positive" command forms typically activate a piece of functionality,
whereas "negative" command forms de-activate or "remove" a piece of functionality from the
running network device. This feature helps provide one way of locating command boundaries,
according to an embodiment of the invention. Thus, by comparing the grammars generated for
both positive and negative forms of commands, we can locate the nodes within the grammar which
represent the start of a command. This relies upon the fact that the two trees will differ only by the
syntax required in a given command language for negation. For example, in Cisco IOS and other
command languages, commands are usually removed by pre-pending "no" to the beginning of the
command.

[0154] With reference to FIG. 15, at step 1510 we begin with the positive grammar created using
the methods discussed above. Using the same methods, we also create a negative
grammar by generating a WalkerViewer syntax tree using the same command root with "no"
prepended, at step 1520. The remaining stages of grammar production are identical. The result is
two grammars - one which applies functionality ("positive") and one which removes functionality
("negative").

[0155] Thus, to find and mark all of the "command boundaries" within the grammar, we walk
through the "positive" grammar node by node, at step 1530. At each node, we look at the
"negative" grammar to determine if a terminal end-of-line character is in an equivalent position as
in the "positive" side, at step 1540. If it is, we have found the boundary of a "command" and can
mark the area of the tree traversed as a FINAL section (the grammar equivalent to a component), at
step 1550.

[0156] The final step is to merge information from the "negative" grammar tree into the positive
tree, in order to create a single grammar which has the ability to generate both activation and de-
activation (positive/negative) forms of each command. This is done by transplanting the negative
form of the command into a sub-section of each FINAL section, marking the negative form as a
REMOVAL, at step 1560. This tag allows the compiler to generate either a positive or negative
form of a command, depending upon whether the user's action was to add a component to a
network device, or remove it.
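
One possible reading of steps 1530 through 1550 is sketched below. It assumes, purely for illustration, that each grammar is a trie of tokens built from example command lines, that a reserved key marks an end-of-line, and that negation is a leading "no" token. The helper names are hypothetical, and the merging of REMOVAL forms at step 1560 is not shown.

```python
EOL = "<EOL>"   # reserved key marking an end-of-line termination


def build_trie(commands):
    """Build a nested-dict trie from whitespace-separated command lines."""
    root = {}
    for command in commands:
        node = root
        for token in command.split():
            node = node.setdefault(token, {})
        node[EOL] = {}
    return root


def find_boundaries(positive, negative, negation="no"):
    """Walk the positive trie and report paths where both tries terminate."""
    found = []

    def walk(pos_node, neg_node, path):
        if EOL in pos_node and EOL in neg_node:
            found.append(" ".join(path))          # candidate FINAL section
        for token, child in pos_node.items():
            if token != EOL and token in neg_node:
                walk(child, neg_node[token], path + [token])

    # The negative trie is entered below its negation keyword.
    walk(positive, negative.get(negation, {}), [])
    return found


positive = build_trie([
    "ip route <prefix> <mask> <gw>",
    "ip route <prefix> <mask> <gw> permanent",
    "hostname <word>",
])
negative = build_trie([
    "no ip route <prefix> <mask> <gw>",
    "no hostname <word>",
])
print(find_boundaries(positive, negative))
```

In this toy input the "permanent" branch has no negative counterpart, so it is treated as an option inside the "ip route" command rather than as a command of its own.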

[0157] Up to this point, the grammars produced operate under the assumption that components 26
are "orthogonal" to each other - in other words, that each can be applied and removed without
affecting any other portion of the configuration 50. Sadly, on many platforms, this assumption is
unwarranted. The addition or removal of a command from a device configuration 50 will often
trigger various "non-local" changes in other aspects of the configuration. In order to fully
understand the effect that addition and removal has upon a network device 10, the system catalogs
these effects during grammar production. In the sections that follow, these effects are referred to
generally as the command-level "semantics" of commands (and by extension, components 26).

[0158] One example of a command-level semantic effect is the necessity of understanding removal
of a component 26. Removal semantics are actually recorded in an earlier step in grammar
production because we also "sectionize" the grammar into components 26 at the same time.
Command-level semantics can vary by platform, but some common effects seen on network
devices today include:

• Commands may have "default" values for some attributes. On some platforms, when
the command is issued with default values, it does not appear in the running
configuration of the device. When the attributes are changed to non-default values,
the command then "appears" in the configuration. This is largely a cosmetic issue
rather than a real difference in device functionality, but from the perspective of
configuration management, it is a significant semantic effect.

• Command attributes may have multiple equivalent formats. For example, port
numbers in Cisco IOS frequently have textual equivalents (e.g., port 80 can also be
entered as "www"). This is also a cosmetic issue from the perspective of device
functionality, but from the perspective of configuration management, it too is a
significant semantic effect, and it is desirable to be able to handle a device that
silently transforms one format to another.

• Commands vary in their "duplication" behavior. In other words, some commands
must be absolutely unique, whereas other commands may be allowed multiple times
in a configuration with different values for key attributes. Given the command
syntax alone, we cannot know this, but in order to guarantee proper configuration
behavior, this is important information to discover during grammar production.

• Commands are often "tied" together in non-obvious ways. Adding one command
may cause others to appear in the configuration (often at their default value), whereas
removing a command may cause other commands to disappear from the
configuration.

[0159] These are examples of command semantics that appear on a number of platforms across the
Cisco product line. Platforms from other vendors, for example, include other forms of semantic
effects. Fortunately, the plug-in nature of an embodiment of the invention allows us to expand the
set of algorithms for semantic discovery in a seamless and natural way.

[0160] Examples of the semantic effects we catalog are as follows: 

• Different representations for the same data value (port 80 = "www").

• Default values for attributes which cause the command to disappear from the native
configuration when entered.

• Dependencies between components upon addition/removal.

• Parameters which make components unique amongst other instances of the same
component.

• The set of parameters which are important in determining whether a component was
properly added or removed from a device during download testing.

[0161] Each method for command-level semantic discovery is based on the general method of
FIG. 16, with specific details for each type of semantics to be discovered:

[0162] The method begins at step 1610, by selecting a network device with the appropriate
operating system and version characteristics. At step 1620, the running configuration of the device
is retrieved. At step 1630, the configuration is perturbed according to the type of semantic data
we're gathering. At step 1640, the running configuration is retrieved again. At step 1650, the
differences between the "pre" and "post" change configurations are determined. At step 1660, the
differences are processed according to the specific type of semantics being investigated. At step
1670, the prior steps are repeated, with various combinations tried in order to discover the data
needed.
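
The general loop of FIG. 16 can be sketched as follows. Everything here is a stand-in: ToyDevice simulates a device in memory, its side effect ("feature a" causing a timer line to appear) is invented for illustration, and the method names get_running_config and apply are assumptions rather than a real device interface.

```python
class ToyDevice:
    """In-memory stand-in for a test device (step 1610); behavior is invented."""

    def __init__(self):
        self.lines = ["hostname lab"]

    def get_running_config(self):
        return list(self.lines)

    def apply(self, line):
        self.lines.append(line)
        if line == "feature a":                 # simulated non-local side effect
            self.lines.append("feature a timer 30")


def diff_configs(pre, post):
    """Return (added, removed) configuration lines, each sorted."""
    pre_set, post_set = set(pre), set(post)
    return sorted(post_set - pre_set), sorted(pre_set - post_set)


def probe(device, perturbations, analyze):
    """Run the retrieve / perturb / retrieve / diff / analyze cycle."""
    results = []
    for change in perturbations:                # step 1670: repeat per combination
        pre = device.get_running_config()       # step 1620
        device.apply(change)                    # step 1630
        post = device.get_running_config()      # step 1640
        added, removed = diff_configs(pre, post)          # step 1650
        results.append(analyze(change, added, removed))   # step 1660
    return results


results = probe(ToyDevice(), ["feature a"],
                lambda change, added, removed: (change, added, removed))
print(results)
```

The analyze callback is where each specific kind of semantic discovery would plug in, which is in keeping with the modular approach described above.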

[0163] Turning to FIG. 17, discovering default values and equivalencies, and otherwise auditing
configuration behavior is a combinatorial process. Starting with a component (abbreviated as "C"
in the algorithm below), and a suitable test device, the method proceeds:

1. Determine the list of possible attributes for component C. This list will be denoted as
POSSIBLE_ATTR. (step 1710)

2. Start with a copy of POSSIBLE_ATTR that denotes the list of attributes which are
useful in determining the success of a download onto a network device. This list is
called the "audit attributes" list, and is scrutinized upon download to determine
whether a component shows up properly after download. This list will be denoted
AUDIT_ATTR. (step 1720)

3. For each attribute (ATTR) in POSSIBLE_ATTR, do the following (step 1730):

    a. Retrieve the configuration of the network device. (step 1740)

    b. Determine the list of possible values that ATTR can take. (step 1750) If the
    attribute is of a type with a short "known" list of values, test each of them. If
    the attribute is of a type with a large (or infinite) list of values, apply a
    sampling strategy to test a list of statistically representative values. This list
    of possible values will be denoted as ATTR_VALUES. In some cases, it
    may be necessary to exhaustively test a range of attribute values in order to
    find the default values. As an alternative, default values can be supplied by a
    human user or "discovered" by parsing the vendor's documentation if
    available in machine-readable format. Also, it may be possible on some
    platforms to query for default values using the command-line interface:
    Cisco's Catalyst OS platform allows this, which would obviate the need for
    exhaustive search, although the procedure is still needed to discover
    auditing/post-download behavior and attribute equivalencies.

    c. For each value (VAL) in ATTR_VALUES, do the following (step 1760):

        i. Set the value of ATTR in component C of the configuration to VAL.
        (step 1765)

        ii. Apply the configuration to the device. (step 1770)

        iii. Retrieve the new running configuration of the device, and parse this
        configuration into a component tree. (step 1775)

        iv. If ATTR shows up in component C as VAL, move on to the next
        value (since VAL is not the "default" value). (step 1780)

        v. If ATTR doesn't show up in component C as VAL, but other values
        of ATTR (from prior or subsequent iterations of the algorithm) do
        show up as their corresponding values (step 1785), then VAL is the
        "default" value, and is marked as such in the grammar for component
        C (step 1787).

        vi. If ATTR doesn't show up as VAL, and no other values do either,
        remove ATTR from the list of AUDIT_ATTR, since it cannot be
        reliably tested as part of the post-download test comparison. (step
        1790)

        vii. If ATTR shows up, but in a different format (or even data type) (step
        1793), add it to a list of "equivalency" mappings for the component.
        For example, port numbers will often be assigned as numbers (e.g.,
        80), but will be returned or displayed in textual equivalents (e.g.,
        www). (step 1795)
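
The inner loop of FIG. 17 for a single attribute can be sketched against a simulated device. The ToyDevice class, its default value of "23", and its alias table are invented for this example; only the classification logic (default, equivalency, or not auditable) follows the steps listed above. The decision in step v is made after all values have been tried, since it depends on whether other values show up.

```python
class ToyDevice:
    """Simulated device: hides the default value and rewrites known aliases."""

    DEFAULTS = {"port": "23"}        # invented default, for illustration only
    ALIASES = {"80": "www"}          # numeric value displayed in textual form

    def __init__(self):
        self.attrs = {}

    def set_attr(self, attr, value):
        self.attrs[attr] = value

    def running_value(self, attr):
        """Return the value as displayed, or None if the attribute is absent."""
        value = self.attrs.get(attr)
        if value is None or value == self.DEFAULTS.get(attr):
            return None
        return self.ALIASES.get(value, value)


def classify_values(device, attr, values):
    """Return (defaults, equivalents, auditable) for one attribute."""
    shown = {}
    for val in values:                              # step 1760
        device.set_attr(attr, val)                  # steps 1765 and 1770
        shown[val] = device.running_value(attr)     # step 1775
    visible = [v for v in values if shown[v] is not None]
    # Step 1787: a hidden value counts as a default only if other values show up.
    defaults = [v for v in values if shown[v] is None] if visible else []
    # Step 1795: the value shows up, but in a different format.
    equivalents = {v: shown[v] for v in visible if shown[v] != v}
    # Step 1790: if nothing ever shows up, the attribute cannot be audited.
    auditable = bool(visible)
    return defaults, equivalents, auditable


print(classify_values(ToyDevice(), "port", ["23", "80", "8080"]))
```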

[0164] The information gathered in the method of FIG. 17 is principally used within regular
configuration audits and post-download configuration comparisons to ensure accurate tracking of
errors. In other words, "disappearing defaults" or automatic translation of values which have
equivalency mappings should not be reported as errors.

[0165] Turning to FIG. 18, discovering which attributes require unique values in order to create a
unique "instance" of a component (and thereby, a unique command) is a combinatorial process.
We start with a component 26, and determine what combinations of its attributes' values can co-
occur on a device 10. The result of this may be that (a) the component 26 is only allowed once on
a device 10, or (b) some combination of values of some attributes defines a "unique" instance of
the component 26, where multiple unique instances can exist on a device 10. If we begin with a
component (abbreviated as "C" in the algorithm below), the process of determining uniqueness is
as follows:

1. Start with an empty list, to which we will append the names of attributes that denote a
unique instance of component C. This list will be denoted as UNIQUE_ATTR. (step
1810)

2. Determine the complete list of possible attributes available for component C. This
list will be denoted as POSSIBLE_ATTR. (step 1820)

3. Retrieve the running configuration of a test device. (step 1830)

4. Determine if the configuration already has an instance of component C present. (step
1840) If not:

    a. Add an instance of component C, with randomly chosen attribute values (of
    appropriate types), to the device configuration. (step 1843)

    b. Apply the configuration to the device. (step 1847)

5. For each attribute (ATTR) in POSSIBLE_ATTR, do the following (step 1850):

    a. Determine the list of possible values for ATTR. (step 1853) If the attribute is
    a type with a short "known" list of values, test each of them. If the attribute is
    a type with a large (or infinite) list of values, generate a sample of possible
    values that is statistically representative. If the attribute is a type with a
    constrained range of values (discovered by WalkerViewer during Phase I),
    select a sample of values from within this range. This list of possible values
    will be denoted as ATTR_VALUES.

    b. For each value (VAL) in ATTR_VALUES, do the following (step 1855):

        i. Add a second instance of component C to the device configuration,
        with VAL for the value of ATTR and otherwise identical to the
        existing instance of component C. (step 1857)

        ii. Apply the configuration to the device. (step 1861)

        iii. Retrieve the new running configuration of the device. (step 1863)

        iv. Determine how many instances of component C exist in the new
        configuration (step 1865):

            1. If only one instance of component C exists, we know that
            VAL is not one of the values (if any) for ATTR that makes an
            instance of the component C unique. (step 1867)

            2. If there are two instances of component C present, we know
            that VAL is one of the values for ATTR that makes an
            instance of component C a unique instance. Add
            ATTR(VAL) to UNIQUE_ATTR.

6. At the end of this combinatorial search, UNIQUE_ATTR will contain the
attributes, and values for those attributes, which cause an instance of component C to
be unique. This list is added to the grammar as an annotation to component C. If
UNIQUE_ATTR is empty when the method finishes, this means that only one
instance of the component can be applied to a device at any one time. Add a notation
to UNIQUE_ATTR that records that component C can only occur once in a
configuration. (step 1870)
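
A compact sketch of the uniqueness search of FIG. 18 is shown below. ToyDevice is a simulated device whose "key attributes" decide whether a second instance coexists with or replaces the first; those keys are hidden from the search, which must rediscover them. The class, its methods, and the reset between trials are assumptions made for the sketch (the steps above do not describe a reset).

```python
class ToyDevice:
    """Simulated device: instances with the same key attributes overwrite each other."""

    def __init__(self, key_attrs):
        self._key_attrs = key_attrs     # hidden rule the search must rediscover
        self.instances = []

    def _key(self, instance):
        return tuple(instance[a] for a in self._key_attrs)

    def reset(self):
        self.instances = []

    def apply(self, instance):
        key = self._key(instance)
        self.instances = [i for i in self.instances if self._key(i) != key]
        self.instances.append(dict(instance))

    def count(self):
        return len(self.instances)


def find_unique_attrs(device, base, samples):
    """Return the (ATTR, VAL) pairs that produce a second, distinct instance."""
    unique_attr = []                                  # step 1810
    for attr, values in samples.items():              # step 1850
        for val in values:                            # step 1855
            device.reset()                            # isolate each trial (assumption)
            device.apply(base)                        # steps 1843 and 1847
            device.apply({**base, attr: val})         # steps 1857 and 1861
            if device.count() == 2:                   # step 1865
                unique_attr.append((attr, val))
    return unique_attr        # empty list: the component may occur only once


base = {"prefix": "10.0.0.0", "tag": "1"}
samples = {"prefix": ["10.1.0.0"], "tag": ["2"]}
print(find_unique_attrs(ToyDevice(["prefix"]), base, samples))
```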

[0166] The information gathered on component duplication can be used a number of ways in an
embodiment of the invention. For example, a user could be prevented from adding two instances
of a component which can only occur once - instead, the existing instance could be replaced by a
new instance.

[0167] In an embodiment, duplication information may be used in a number of ways to ensure a
consistent and correct final configuration for each network device:

1. Every time a policy-driven configuration is compiled into a "native" configuration,
the list of components is traversed and any duplicated components are evaluated
according to their UNIQUE_ATTR list. Duplicates are disallowed if the component
must be singly present, and if multiple instances are allowed, each instance is
checked for uniqueness.

2. Whenever a component is inserted into a policy-driven (PD) configuration, the
component is checked to determine if it must be singly present. Since data may not
have been filled in for attributes at this point, we cannot yet check for uniqueness of
multiple copies of the same component (this is checked at editing or compilation
time).

3. Whenever a policy is inserted into a PD configuration, the list of components
contained within the policy is checked against the policy-driven configuration. Any
components which are duplicated between the policy and the PD configuration are
evaluated. Components which must be singly present are deleted from the PD
configuration and the instance from the policy is kept. Components which may be
multiply present are evaluated according to their UNIQUE_ATTR lists. If there are
still "overlaps," given the uniqueness criteria, the version from the PD configuration
is deleted and the version from the policy is kept. Retaining the policy version and
deleting the PD configuration version is an implementation-dependent decision,
designed to place as much reliance on policies as possible (in order to give customers
the leverage of keeping configurations factored into policies). In alternate
embodiments, the components from the PD configurations could be retained and
those from the policies deleted, or other schemes could be implemented as desired.

4. Whenever the policy-driven configuration is edited, any change to an attribute value
is examined to determine if it results in a duplicated component. If an edit does result
in a duplicate component, the duplicates are evaluated according to their
UNIQUE_ATTR lists, and if necessary, an error message can be presented to the
user.
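
The compile-time check in item 1 might look like the sketch below. It assumes a simplified annotation format in which each component name maps to the list of attribute names that make an instance unique, with an empty list meaning "only once per configuration". The function name and data shapes are hypothetical.

```python
def find_conflicts(components, unique_attrs):
    """Return (first_index, duplicate_index) pairs for conflicting components.

    components:   list of (component_name, attribute_dict) in configuration order.
    unique_attrs: component_name -> list of attribute names that make an
                  instance unique; an empty list means the component may
                  appear only once.
    """
    seen = {}
    conflicts = []
    for index, (name, attrs) in enumerate(components):
        key = (name,) + tuple(attrs.get(a) for a in unique_attrs.get(name, []))
        if key in seen:
            conflicts.append((seen[key], index))
        else:
            seen[key] = index
    return conflicts


components = [
    ("hostname", {"name": "a"}),
    ("route", {"prefix": "10.0.0.0"}),
    ("route", {"prefix": "10.1.0.0"}),     # distinct prefix: allowed
    ("hostname", {"name": "b"}),           # singleton repeated: conflict
    ("route", {"prefix": "10.0.0.0"}),     # same prefix again: conflict
]
unique_attrs = {"hostname": [], "route": ["prefix"]}
print(find_conflicts(components, unique_attrs))
```

Which side of a conflict is kept (policy or PD configuration) is left to the caller, since the text above treats that as an implementation-dependent decision.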

[0168] Discovery of dependencies between components is also a combinatorial process. We begin
with a component, and determine the effect of adding and removing that component (with realistic
test attribute data) from a device configuration. The linkages discovered are then added to the
grammar as annotations linking two components.

[0169] We begin with a component C, to discover its dependencies (if any):

1. Turning to FIG. 19, to determine dependencies upon addition:

    a. Retrieve the configuration of a test device, caching it for future use as PRE-
    CONFIG. (step 1910)

    b. Add a new component C to the configuration, filling in random values for
    required attributes (of the appropriate data types). (step 1920)

    c. Apply the altered configuration to the device. (step 1930)

    d. Retrieve the new running configuration, caching it as POST-CONFIG. (step
    1940)

    e. Parse both cached configurations into a tree of components (step 1950), and
    compare the two trees, producing a list of differences. (step 1960)

    f. The list of differences between PRE-CONFIG and POST-CONFIG should
    include component C. Search for, and record, any other components which
    newly appear in the configuration as a result of adding component C. Also,
    there may be components which "disappear" with the addition of component
    C. Add both to the list of "addition" dependencies for component C. (step
    1970)

2. Turning to FIG. 20, to determine dependencies upon removal:

    a. Start with the POST-CONFIG, which contains component C. Cache this as
    the PRE-CONFIG for removal testing. (step 2010)

    b. Remove component C from the PRE-CONFIG. (step 2020)

    c. Apply this altered configuration to the device. (step 2030)

    d. Retrieve the new running configuration from the device, caching it as the new
    POST-CONFIG. (step 2040)

    e. Parse both cached configurations into a tree of components (step 2050), and
    compare the two trees, producing a list of differences. (step 2060)

    f. The list of differences between PRE-CONFIG and POST-CONFIG should
    include component C as a deletion. If no other differences are detected, then
    removal can be marked as free of dependencies. If not, the list of differences
    except component C itself constitutes the removal dependencies of component
    C. Add any differences to the list of "removal" dependencies for component
    C. (step 2070)
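
The two procedures above reduce to a pair of before/after comparisons, sketched here against a simulated device. ToyDevice and its LINKS table (adding "C" drags in "D", and removing "C" removes "D") are invented for illustration, and configurations are modeled as flat sets of component names rather than parsed component trees.

```python
class ToyDevice:
    """Simulated device in which some components are silently tied together."""

    LINKS = {"C": ["D"]}             # invented linkage, for illustration only

    def __init__(self, components):
        self.components = set(components)

    def running(self):
        return set(self.components)

    def add(self, component):
        self.components |= {component, *self.LINKS.get(component, [])}

    def remove(self, component):
        self.components -= {component, *self.LINKS.get(component, [])}


def addition_dependencies(device, component):
    """Return (appeared, disappeared) components other than the one added."""
    pre = device.running()                       # step 1910
    device.add(component)                        # steps 1920 and 1930
    post = device.running()                      # step 1940
    appeared = sorted(post - pre - {component})  # steps 1960 and 1970
    disappeared = sorted(pre - post)
    return appeared, disappeared


def removal_dependencies(device, component):
    """Return (disappeared, appeared) components other than the one removed."""
    pre = device.running()                       # step 2010
    device.remove(component)                     # steps 2020 and 2030
    post = device.running()                      # step 2040
    disappeared = sorted((pre - post) - {component})   # steps 2060 and 2070
    appeared = sorted(post - pre)
    return disappeared, appeared


device = ToyDevice({"A"})
added = addition_dependencies(device, "C")
removed = removal_dependencies(device, "C")
print(added, removed)
```

Running the addition check first leaves the device in a state that contains both C and its dependent, which is what allows the removal check to observe the dependent disappearing.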

[0170] The algorithm discovers removal dependencies that exist within a vendor language. The
algorithm will discover all newly appearing dependent components in addition or removal
dependencies, by definition, but it cannot discover a removed dependent component in either an
addition or removal dependency unless the dependent component was present in the initial tested
configuration. This can be remedied by, for example, exhaustive combinatorial testing of
components in an N x N matrix. Also, performing addition dependency checking first, before
performing removal checking, raises the likelihood significantly of seeing an accurate picture of
removed components.

[0171] The data on inter-component dependencies is encoded within the grammar as annotations
on components 26. This data can be used in a number of ways within an embodiment of the
invention. The dependencies can be used within the user's editing interface. When a component is
added to a policy-driven configuration, dependent components can be added as well. Similarly,
upon removal, dependent components can be removed if appropriate. The dependencies can
simply be taken into account during post-download testing of a configuration. For example, if we
download a new native configuration which removes component A from the device, we should
expect the post-download running configuration to be missing component A and any components
which have a removal dependency upon A. Similarly, if we add component B to the device, we
should expect the new running configuration to contain component B and any components (with
attendant data) which have an addition dependency upon B. These can be used in tandem, or
separately, within an embodiment of the invention.

[0172] The system of an embodiment also allows for version control of all editable data in the
system. In our preferred embodiment, PDC 16, policies 34, device data storage 22, instances 20,
and components 26 are version controlled within the system. In other words, the component
database 28 and other data storage entities in an embodiment can be constructed so as to preserve
the history of changes to each of the stored data items. Example embodiments might use files
stored in a commonly available version control system (e.g., RCS, Microsoft Source Safe);
alternatives also include storing objects in a SQL database with tables tracking changes to each
object. Version control, however accomplished, allows detailed tracking of when and by whom
each data element is changed, and reconstruction of exact content and structural changes to each
entity.

[0173] When a component or policy changes within the version-control database, the change
implies that a number of devices (from zero to all of the devices in the inventory) may be affected
and require new configuration download. The system tracks dependencies between device
configurations and components/policies, such that changes can quickly be mapped to the set of
devices requiring update.
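
One simple way to map changes to affected devices is a reverse index from each component or policy to the devices whose configurations use it. The sketch below assumes such an index can be built from a per-device list of item names; the function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_index(device_configs):
    """Invert device -> items into item -> set of devices."""
    index = defaultdict(set)
    for device, items in device_configs.items():
        for item in items:
            index[item].add(device)
    return index


def devices_needing_update(index, changed_items):
    """Return the sorted devices that use any changed component or policy."""
    affected = set()
    for item in changed_items:
        affected |= index.get(item, set())
    return sorted(affected)


device_configs = {
    "router-1": ["policy-acl", "comp-ntp"],
    "router-2": ["policy-acl"],
    "switch-1": ["comp-vlan"],
}
index = build_index(device_configs)
print(devices_needing_update(index, ["policy-acl"]))
```

A change to an item that no device uses yields an empty list, matching the "from zero to all of the devices" range mentioned above.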

[0174] Users won't often have to directly edit the grammars which underlie components, but in
those situations where necessary, the system provides for an editing interface. The driving
principle for syntax editing is that users should not be exposed to formal grammar specifications
such as BNF or ASN.1. Operating directly upon grammar specifications is not only unfamiliar to
most network engineers, but a highly error-prone process with broad ramifications throughout the
system.

[0175] Therefore, an embodiment of the invention provides a mechanism for rendering a
component 26 into a human-readable form which can be edited in a component editor 41. The
component is compiled by the DLS 44 into its native result, by filling in arbitrarily chosen data
values for any attributes. The result can be provided to the user in an editor 41, at which point any
changes can be presented to the DLS 44 and Grammar Builder 45 for translation back into formal
grammar 70 and modification of the component 26. The changed component 26 is then saved to
the component database 28 for future use.

[0176] The changes are immediately available for future use by the DLS because the changed
component representation is newer than the cached version of the component, which had
previously been compiled into an executable parser. This change in the component therefore
triggers the recompilation of the parsers.
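
The "newer than the cached version" test can be sketched as a small cache keyed by component name. This is a minimal illustration, not the actual DLS implementation; the class, the integer version counter, and the `compile_fn` callback are assumptions introduced for the example.

```python
class ParserCache:
    """Hypothetical cache of compiled parsers, keyed by component name."""

    def __init__(self, compile_fn):
        self._compile = compile_fn   # assumed callback: grammar source -> parser
        self._cache = {}             # name -> (version, compiled parser)

    def get(self, name, version, grammar_source):
        """Return a parser, recompiling only if the stored component is newer."""
        cached = self._cache.get(name)
        if cached is None or cached[0] < version:
            cached = (version, self._compile(grammar_source))
            self._cache[name] = cached
        return cached[1]
```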

[0177] In an alternative embodiment, the executable parser is static and never needs recompilation,
given a table-driven implementation where component changes merely update a table which is
re-read at every invocation of the parser. This alternative embodiment simplifies the implementation
considerably but does not materially change the results of operation.

[0178] In general, there is a single grammar specification for a given vendor command language.
In some cases, there may be separate grammar specifications (and thus separate components) for
widely varying versions of a command language. There is considerable variation, however, in
support for features of a typical command language across product models, hardware
configurations, and software versions.

[0179] The system handles this kind of variation by allowing a component 26 to possess multiple
alternative sets of syntax blocks 36, as shown in FIG. 21. Each alternative set of syntax blocks is
associated with a conditional 37, which is a rule of the form "if...then" which governs when that
syntax block is used. During compilation of a native configuration from a PD configuration 16, the
correct syntax block 36 is chosen by determining the most-specific match of device data 12 to the
conditions specified in the conditional 37. Typical device data used in conditional rules are
metadata such as hardware platform, software version, or physical hardware interface type. At
least one syntax block 36 is designated as the "default" implementation, and is used to compile a
native configuration whenever there are no matches between device data 12 and conditionals 37. 
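
The selection of a syntax block by most-specific match can be sketched as follows. The representation is an assumption made for illustration: each conditional is modeled as a dictionary of required device-data values, an empty dictionary marks the default block, and "most specific" is taken to mean the matching conditional with the largest number of conditions. The specification does not define how specificity is measured.

```python
def select_syntax_block(variants, device_data):
    """Pick the syntax block whose conditional best matches the device data.

    `variants` is a list of (conditional, syntax_block) pairs. A conditional
    is a dict of required device-data values; the empty dict always matches
    and therefore acts as the default.
    """
    best_block, best_score = None, -1
    for conditional, block in variants:
        if all(device_data.get(key) == value for key, value in conditional.items()):
            score = len(conditional)   # assumed measure of specificity
            if score > best_score:
                best_block, best_score = block, score
    return best_block
```

With variants for the default, for a hypothetical platform "X1", and for "X1" running version "12.2", a device reporting both values selects the last variant, while a device on an unrelated platform falls back to the default.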

[0180] Component variants can be created by the user, or by Grammar Builder 45 if necessary,
and may be created at the desired degree of detail - in other words, it is possible to create a
component variant which applies to a single hardware platform, running a specific software
version, using specific hardware card types and firmware versions. This kind of specificity is often
needed simply to work around bugs and problems in network hardware vendor implementations.

[0181] Turning to FIG. 22, native configurations are created by compilation using the device
instance tree and the grammar represented by each component's syntax. This compilation process
may be performed by the Device Learning System 44, since as noted above, the DLS 44 uses the
same lexer and parser to do analysis of native configurations as is used to compile PD
configurations 16 into native configurations. During the compilation process, any outstanding data
references are resolved, and the final configuration (whether full or incremental) is placed in a
staging area (e.g. the device data structure 12) for retrieval and application to the physical device.

[0182] The compilation process begins at step 2210 by taking the full grammar composed from all
of the stored components (within a given vendor language), and the stored instance tree for the PD
configuration 16 for the target device at step 2220. These form inputs to a recursive-descent parser
which emits the target configuration language as its "object code."

[0183] In the preferred embodiment, the compiler is written as an LL(0) recursive-descent parser
because it is difficult to plug an object tree into a standard off-the-shelf LALR parser (e.g.,
Bison/yacc) and emit a configuration. In general, the compilation process recursively walks the
instance tree at step 2230, emitting literals from the corresponding grammar specification in the
case of literal instance data. When object data members are encountered, we track the usage of the
reference and only emit syntax corresponding to the data reference once.

[0184] In the case of lists, we recurse through the list, keeping a reference to the current position of
the output buffer before each recursion step. If a recursion step results in an incomplete output
string (i.e., an incomplete vendor command string), we return to the caller and restore the output
buffer pointer to the position it held prior to the recursion. This prevents the output of incomplete
command strings.
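
The buffer-restoring behavior for lists can be sketched in Python as below. The function names and the example command are hypothetical, and the sketch assumes that processing continues with the next list item after an incomplete one is discarded, which the text above does not state explicitly.

```python
def emit_list(items, emit_item):
    """Emit each list item, discarding partial output from incomplete items."""
    out = []
    for item in items:
        mark = len(out)                  # buffer position before this step
        complete = emit_item(item, out)
        if not complete:
            del out[mark:]               # restore the buffer to the saved position
    return "".join(out)


def emit_ntp_server(item, out):
    """Hypothetical emitter: writes a literal, then requires an address."""
    out.append("ntp server ")
    if "address" not in item:
        return False                     # command string would be incomplete
    out.append(item["address"] + "\n")
    return True
```

An item lacking an address leaves no trace in the output, because the literal written before the failure is rolled back along with it.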

[0185] Compilation need not always generate a complete native configuration. In addition to
compiling the full configuration for a device, the system generates incremental configurations,
which contain only those components and data which are changed since the last configuration
event. Compilation can start from any subtree within the PD configuration 16, given an
appropriate grammar subtree that contains needed syntax.

[0186] Incremental changes to physical devices minimize the impact of changes to unrelated
functions and in general increase network stability. Full configuration compilation is still useful,
however, for previewing the impact of changes, testing in laboratory contexts, and for re-generating
a physical device in the event of hardware failure.

[0187] Updating physical devices with new configurations is another significant aspect of network
control. The exact means used to apply configuration changes varies according to the type of
device, and is determined by the features provided by the device vendor.

[0188] Turning to FIG. 23, a system in accordance with an embodiment of the invention wraps the
device vendor's own mechanisms with a process designed to provide safety for the actual change
execution. We treat application of native configurations to a device as a three-stage process. At
step 2310, precondition checking is designed to prevent any changes from proceeding in an
environment which is not well-controlled, and is equivalent to an airline pilot's "pre-flight check."

[0189] If this check succeeds, at step 2320 we use a software driver which implements a
vendor-specific process for applying configuration changes to the device. If errors occur in the
application process, the configuration is rolled back to the previously running configuration and the
problem reported to the task owner. If no errors occur, we then perform postcondition checks at
step 2330, to verify that the device is being left in a functional state.

[0190] To be more specific, the download process operates according to a strict contract, modeled 
after interface contracts in object and component-oriented programming. Changes cannot occur if 
preconditions are not established, or the results of a change cannot be predicted. And a change 
cannot be considered complete until postconditions are established. 
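
The three-stage contract can be sketched as a single function that wraps a vendor-specific driver. This is a minimal illustration only: the device is modeled as a dictionary, and `ApplyError`, the callback parameters, and the returned status strings are hypothetical names. The handling of a failed postcondition is not described in the text, so the sketch merely reports it.

```python
class ApplyError(Exception):
    """Hypothetical error raised by a vendor-specific driver."""


def download(device, new_config, check_pre, apply_change, check_post):
    """Apply a configuration under precondition/postcondition checks."""
    if not check_pre(device):                 # step 2310: "pre-flight check"
        return "blocked"
    previous = device["running"]
    try:
        apply_change(device, new_config)      # step 2320: vendor-specific driver
    except ApplyError:
        device["running"] = previous          # roll back to the prior configuration
        return "rolled_back"
    if not check_post(device):                # step 2330: functional verification
        return "postcondition_failed"
    return "complete"
```

A change is reported as complete only when all three stages succeed, mirroring the contract described above.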

[0191] Precondition checks are performed prior to any software-mediated change to a device
configuration. In order for a change to be "safe" to apply, the physical network device must be in a
known and predictable state. For example, if the device firmware or operating system version has
changed since the last configuration was compiled, our software cannot guarantee that the new
configuration will work in a predictable manner.

[0192] Thus, precondition checking seeks to establish the "environmental state" of a network
device prior to applying a configuration change. We attempt to establish the invariance of:

• Hardware platform

• Installed hardware if a modular architecture

• Operating system software or firmware version

• Running configuration version

[0193] If any of these environmental factors is different than expected, we do not allow the
configuration change or download to proceed. Additionally, we may collect functional data about
a device, to aid in the process of assessing proper operation following application of the pending
change.
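
The invariance test can be sketched as a comparison between the environmental state recorded when the configuration was compiled and the state observed on the device. The key names and function names below are hypothetical labels for the four factors listed above.

```python
# Hypothetical keys for the four environmental factors listed above.
INVARIANTS = (
    "hardware_platform",
    "installed_hardware",
    "os_version",
    "running_config_version",
)


def changed_invariants(expected, observed):
    """Return the environmental factors that differ from what was expected."""
    return [key for key in INVARIANTS if expected.get(key) != observed.get(key)]


def may_proceed(expected, observed):
    """A download may proceed only if every factor is unchanged."""
    return not changed_invariants(expected, observed)
```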

[0194] Postcondition checking is different than the precondition test in that we seek to determine
whether the network device is functioning properly, following application of a configuration
change. When making postcondition checks, we typically seek to establish that interfaces
designated as "up" are passing traffic, establish that the routing table differs in "predictable" ways,
and so on.

[0195] Much of the postcondition functionality checking is implemented as pluggable modules in
a scripting language. This allows the system to respond quickly in the face of changes to vendor
commands or command output formats, and to allow professional services or third-parties to easily
extend this phase of testing.
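
A pluggable arrangement of this kind could look like the following Python sketch, in which each check registers itself and the framework simply runs whatever is registered. The registry, the decorator, and the shape of the `state` dictionary are assumptions made for illustration; the specification does not name a scripting language or a plug-in interface.

```python
POSTCONDITION_CHECKS = []   # hypothetical registry of pluggable check functions


def postcondition(fn):
    """Decorator that registers a check; third parties can add their own."""
    POSTCONDITION_CHECKS.append(fn)
    return fn


@postcondition
def up_interfaces_pass_traffic(state):
    """Report interfaces designated "up" that show no traffic."""
    return [
        f"{name}: designated up but passing no traffic"
        for name, info in state["interfaces"].items()
        if info["designated"] == "up" and info["packets"] == 0
    ]


def run_postconditions(state):
    """Run every registered check and collect the reported problems."""
    problems = []
    for check in POSTCONDITION_CHECKS:
        problems.extend(check(state))
    return problems
```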

[0196] A complete system for controlling network devices should ensure that network devices
continue to implement appropriate policies, even after the configuration is applied and tested.
Engineers might change the device configuration manually in the course of troubleshooting or
maintenance. More rarely, unauthorized changes can be made, either by employees or by intrusion
from outside the customer organization.

[0197] The system performs periodic auditing of all production network devices, to detect changes
to a network device which did not originate within the change control process of an embodiment of
the invention. The auditing interval is configurable, and often will be performed at least once per
day (if not more often).

[0198] Turning to FIG. 24, basic periodic auditing has five steps. At step 2410, the running
configuration for a device 10 is retrieved. The most recent version of the PD configuration 16 is
retrieved from the component database 28 in step 2420. This version of the PD configuration 16 is
what the system believes should be present on the device, absent any changes from outside sources.
At step 2430, the running configuration is passed to the DLS 44 and parsed into a PD configuration
16 as well. Next, in step 2440, the two PD configurations 16 are compared by walking the two
trees of instances 20 in tandem, noting any components 26 or policies 34 that occur in one PD
configuration but not the other. If the resulting list of missing components 26 or policies 34 is
empty, then the network device 10 has not been altered since the last time the system performed a
change to its configuration. Otherwise, the list of missing components or policies is stored as the
result of the audit, along with the date and time of the audit, for presentation to users within the user
interface 40.
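
The tandem tree walk of step 2440 can be sketched as a recursive comparison. The sketch assumes, purely for illustration, that each PD configuration is represented as a nested dictionary mapping a component or policy name to its child instances; the function name and the result labels are hypothetical.

```python
def diff_trees(expected, actual, path=()):
    """Walk two instance trees in tandem and list entries found in only one.

    Each tree is a nested dict: name -> subtree of child instances.
    """
    differences = []
    for name in sorted(set(expected) | set(actual)):
        here = path + (name,)
        if name not in actual:
            differences.append(("missing_from_device", "/".join(here)))
        elif name not in expected:
            differences.append(("unexpected_on_device", "/".join(here)))
        else:
            differences.extend(diff_trees(expected[name], actual[name], here))
    return differences
```

An empty result corresponds to the case in which the device has not been altered since the last change performed by the system.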

[0199] In addition to basic auditing to monitor changes that arise from outside the system, the
system monitors the linkage of policies 34 to PD configurations 16 in order to ensure that the
design of the network remains as the user intends. This process is depicted in FIG. 25. In step
2510, a list of network devices is created. The system then selects each device in turn (step 2530)
until the list is exhausted (step 2520). At step 2540, the list of policy linkages 35 is retrieved,
followed by retrieval of the device's running configuration in step 2550. In step 2560, the running
configuration is parsed by the DLS 44 into a PD configuration 16. The system then selects each
policy linkage 35 for the device in turn (step 2575) until the list for that device is exhausted (step
2570). At step 2580, we examine the PD configuration 16 to determine whether the policy 34
represented by the selected policy linkage 35 is present. If it is present, we simply move on to the
next policy linkage. If the policy is not present, we add the policy linkage 35 to a list of missing
policies (step 2585). When all devices have been processed in this manner, we record the list of
missing policies as the result of the audit (step 2590). An empty list indicates that all policies 34
are present where the users of the system intend them to be. A list with missing policy linkages 35
indicates that some devices have departed (through manual changes or mistakes in editing within
the system) from their intended design.
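
The loop structure of FIG. 25 can be sketched as two nested iterations. The callback parameters stand in for the retrieval and parsing steps and are hypothetical; in particular, the parsed PD configuration is assumed to support a simple membership test for policies.

```python
def audit_policy_linkages(devices, linkages_for, running_config_for, parse_to_pd):
    """Return (device, policy) pairs whose linked policy is absent on the device."""
    missing = []
    for device in devices:                                   # steps 2520, 2530
        linkages = linkages_for(device)                      # step 2540
        pd_config = parse_to_pd(running_config_for(device))  # steps 2550, 2560
        for policy in linkages:                              # steps 2570, 2575
            if policy not in pd_config:                      # step 2580
                missing.append((device, policy))             # step 2585
    return missing                                           # step 2590
```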

[0200] The system incorporates a reporting engine which allows the user to perform standard types
of usage reports. Typical reports would be lists of devices requiring updated configurations, the
results of basic or policy linkage audits, lists of tasks broken down by user, reports on job and task
completion and schedules, and the history of any device, policy, or component in the database.
These reporting engines are well-known in the art. 

[0201] In the foregoing specification, the invention has been described with reference to specific
embodiments thereof. It will, however, be evident that various modifications and changes may be
made thereto without departing from the broader spirit and scope of the invention. For example,
the reader is to understand that the specific ordering and combination of process actions shown in
the process flow diagrams described herein is merely illustrative, and the invention can be
performed using different or additional process actions, or a different combination or ordering of
process actions. Features and processes known to those of ordinary skill in the art of network
management systems may similarly be incorporated as desired. Additionally, features may be
added or subtracted as desired. The specification and drawings are, accordingly, to be regarded in
an illustrative rather than restrictive sense, and the invention is not to be restricted or limited except
in accordance with the following claims and their legal equivalents.

WE CLAIM:

1. A network management system for managing a plurality of network devices, comprising:

a device database for storing a native configuration for one of the plurality of network
devices;

a component database for storing configuration information used to configure the plurality
of network devices, wherein the configuration information is stored as a plurality of components
and a plurality of candidate components;

a device learning module for receiving the native configuration from the device database,
identifying the configuration information from the native configuration, and storing the
configuration information in the component database; and

a grammar builder for receiving a candidate component, resolving the candidate component 
into a component, and storing the component in the component database. 

2. The system of claim 1, wherein the configuration information identified from the native
configuration information comprises a policy.

3. The system of claim 2, wherein the policy comprises a plurality of components.

4. The system of claim 2, wherein the policy comprises a template adapted to aggregate together a
plurality of components all sharing a common attribute.

5. The system of claim 4, wherein the common attribute comprises membership in a network
configuration function for one of the plurality of network devices.

6. The system of claim 1, wherein the plurality of components comprise a policy-driven
configuration for one of the plurality of network devices.

7. The system of claim 1, wherein each component comprises a component syntax block.

8. The system of claim 1, wherein each component comprises a configuration command for a
configuration command language, the configuration command adapted to configure a function
on the network device.

9. The system of claim 1, wherein the configuration information identified from the native
configuration information comprises a candidate component.

10. The system of claim 1, wherein the grammar builder receives the candidate component from
the device learning module.

11. The system of claim 1, wherein the grammar builder receives the candidate component from
the component database.

12. The system of claim 1, wherein the device learning module further comprises a lexer adapted to
tokenize the native configuration, and a parser adapted to parse the tokenized native
configuration to identify a plurality of components and a candidate component within the
tokenized native configuration.

13. The system of claim 12, wherein the parser is configured to identify the plurality of
components and the candidate component by comparing the tokenized native configuration to
the configuration information stored in the component database.

14. The system of claim 13, wherein the parser is adapted to identify the candidate component by 
detecting a portion of the tokenized native configuration that does not match the configuration 
information stored in the component database. 

15. The system of claim 13, wherein the parser is adapted to identify the plurality of components
by detecting portions of the tokenized native configuration that match the configuration
information stored in the component database.

16. The system of claim 12, wherein the lexer is configured to tokenize the native configuration
according to a grammar embodied in the configuration information stored in the component
database.

17. The system of claim 12, wherein at least a first portion of the lexer and the parser are
recompiled each time the device learning module receives a native configuration.

18. The system of claim 17, wherein a second portion of the lexer and the parser remains constant
across multiple receptions of native configurations.

19. The system of claim 12, wherein the device learning module further comprises a policy
matcher, adapted to parse the plurality of components and recognize a policy contained in the
plurality of components.

20. The system of claim 19, wherein the device learning module is adapted to output a
policy-driven configuration and zero or more candidate components.

21. The system of claim 1, wherein the device learning module is adapted to compile a
policy-driven configuration into a native configuration.

22. The system of claim 1, further comprising a device data storage for storing device-specific
information about the plurality of network devices.

23. The system of claim 1, wherein the grammar builder is adapted to retrieve command root
completion information from one of the plurality of network devices, create a plurality of
command root derived components from the command root completion information, and
resolve the candidate component into a component by comparing the candidate component to
the plurality of command root derived components, to identify one of the plurality of command
root derived components that matches the candidate component.

24. A method of parsing a native configuration into a policy-driven configuration, comprising:

receiving configuration information comprising a plurality of components;

receiving the native configuration;

tokenizing the native configuration using a lexer module;

parsing the tokenized native configuration using a parser module, to identify a plurality of
input components contained in the native configuration and match the plurality of input
components with the plurality of components;

parsing the tokenized native configuration using the parser module, to identify one or more
unknown regions contained in the native configuration, which do not match any of the plurality of
components;

emitting a tree of components, comprising the plurality of matched input components and
the one or more unknown regions;

processing the one or more unknown regions to identify one or more candidate
components;

analyzing the tree of components to identify one or more policies present in the tree of
components; and

outputting the one or more policies and the one or more candidate components, as a
policy-driven configuration.

25. The method of claim 24, further comprising: 

compiling the lexer module using the configuration information, such that the lexer module 
is adapted to tokenize the native configuration according to a grammar embodied in the 
configuration information; 

compiling the parser module using the configuration information, such that the parser 
module is adapted to match the tokenized native configuration to the configuration information.

26. The method of claim 24, further comprising reorganizing the tree of components to enhance
user comprehension of the tree.

27. The method of claim 24, further comprising compressing the tree to remove unnecessary data
from the tree.

28. The method of claim 27, wherein the unnecessary data comprises an empty node in the tree. 

29. The method of claim 24, wherein the plurality of components comprises a policy, and
analyzing the tree of components comprises:

retrieving the policy;

comparing each of the plurality of components in the policy with the plurality of matched
input components contained in the tree of components;

aborting the analysis if any of the plurality of components in the policy is not found in the
plurality of matched input components; and

identifying the policy as present in the tree of components if all of the plurality of
components in the policy are found in the plurality of matched input components.

30. The method of claim 29, further comprising removing all of the components in the plurality of
matched input components that were matched to the plurality of components in the policy, and
inserting the policy into the tree of components.

31. The method of claim 29, wherein the policy comprises an ordered policy, and identifying the
policy further comprises identifying the ordered policy as present in the tree of components if
all of the plurality of components in the ordered policy are found in the plurality of matched
input components in the same order as in the ordered policy.

32. The method of claim 24, wherein parsing the tokenized native configuration to identify one or
more unknown regions comprises:

detecting a parse error, wherein the parser fails to recognize a portion of the tokenized
native configuration;

marking the portion for further processing; and

continuing to parse the tokenized native configuration.

33. The method of claim 24, wherein processing the one or more unknown regions to identify one
or more candidate components comprises:

walking the tree of components;

marking each of the plurality of matched input components in the tree of components;

re-walking the tree of components;

identifying a token within one of the one or more unknown regions within the tree of
components;

walking the tree of components in each of two directions, beginning at the identified token,
to identify a beginning point and an end point of the unknown region; and

marking the unknown region as a candidate component.